NODE_disk_write_latency

Hi, I have a problem: the disk write latency of my Prometheus deployment is periodically very high. Every 6 hours the latency climbs above 30ms and stays there for some time.

I guess this is a TSDB issue; Prometheus compacts and writes data to disk at regular intervals.
Reference documents: Storage | Prometheus

Can I reduce the disk write latency by modifying the following two parameters?
--storage.tsdb.min-block-duration
--storage.tsdb.max-block-duration

Or are there other ways to solve this problem?
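
For reference, this is how the flags would look if set (shown here as container args; the values are just placeholders, and I believe these are hidden flags, so I may have the defaults wrong):

```yaml
args:
  - --config.file=/etc/prometheus/prometheus.yml
  # Hidden flags (values below are placeholders, not recommendations):
  - --storage.tsdb.min-block-duration=2h    # default: 2h
  - --storage.tsdb.max-block-duration=6h    # default: 10% of the retention period
```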

What does reducing the disk latency actually solve? Seems like you’re trying to solve a non-problem. Prometheus needs to use the disk to compact the TSDB data. Modifying those flags is just going to make things worse.

Thanks for your help.
We now use NVMe SSDs, so we have a monitoring indicator that requires node_disk_write_latency to be less than 16ms.

If the disk write latency exceeds 16ms, and goes up to 30ms, it means the SSD is under high load, or even in an abnormal state. We want to reduce the pressure on the disk and keep it in a normal state (write latency below 16ms).

That sounds like some arbitrary measure you made up. Do you have a source for this “rule”?

node_disk_write_latency is not a metric exposed by the node_exporter. Where does this come from?

You’ve set an arbitrary measure for no determinable reason.

“Pressure on the disk” is a meaningless measure, especially if the device is dedicated to a specific task. What you’re saying makes no sense.

This is an alert rule for a TiDB cluster, and it is triggered frequently on my Prometheus node.
TiDB Cluster Alert Rules | PingCAP Docs
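
As far as I can tell, node_disk_write_latency is not scraped directly from node_exporter but is produced by a recording rule in the TiDB monitoring stack. My rough reconstruction looks like this (the exact expression in the TiDB rules may differ):

```yaml
groups:
  - name: node-disk-write-latency-sketch   # name made up for this example
    rules:
      # Average seconds per write over the last 5m, from node_exporter counters.
      - record: node_disk_write_latency
        expr: |
          rate(node_disk_write_time_seconds_total[5m])
            / rate(node_disk_writes_completed_total[5m])
      # Fire when the average write takes longer than 16ms.
      - alert: NODE_disk_write_latency
        expr: node_disk_write_latency > 0.016
        for: 5m
```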

Those alerts at most indicate you have a defective SSD/NVMe device, not a problem with Prometheus. The “Solutions” provided are not actually solutions.

Also, those alerts may be appropriate for TiDB, but they are not useful for Prometheus.

OK, I understand what you mean.
I just want to find a way to optimize the TSDB so that it writes or compacts data more frequently. That way it would have less data to process each time, reducing the pressure on the disk.

Thank you again. If there is no solution here, I’ll try another way.

I think the question still comes down to “why”? Are you seeing any actual issues? It is expected that Prometheus will periodically have times where there is more disk I/O (e.g. when blocks are being written or expired, or when certain queries are happening). In this sense the usage will always be “bursty”. In general the default settings are the ones to use; they should only be adjusted very carefully - you can very easily make things a lot worse - and only if they are causing specific problems.
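
If you want to confirm that the latency spikes line up with TSDB compaction rather than a faulty device, you could graph the disk write rate alongside the compaction counter, e.g. with something like the following (the rule names here are made up; running the two expressions ad hoc in the graph UI works just as well):

```yaml
groups:
  - name: prometheus-io-burst-sketch   # example name only
    rules:
      # Bytes written per second to the Prometheus data disk.
      - record: sketch:node_disk_written_bytes:rate5m
        expr: rate(node_disk_written_bytes_total[5m])
      # TSDB compactions per second; spikes here should line up with the bursts.
      - record: sketch:prometheus_tsdb_compactions:rate5m
        expr: rate(prometheus_tsdb_compactions_total[5m])
```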

Changing these settings makes things worse. The reason the TSDB does compaction at the frequency it does is to reduce the overall write rate of the system. The write pattern is designed to produce the most compact on-disk data. Compacting more frequently would leave the data less compressed, and that would have a much worse impact, since less compact data takes more memory to use.

Prometheus itself doesn’t actually need NVMe storage. Normal HDD storage is typically good enough. The read/write patterns are not super heavy (as you noticed, it’s just a burst every once in a while). This is because the actual metric processing happens in memory (page cache) rather than depending on heavy I/O.

Thanks for your helpful reply. I will refer to it and give feedback to my team.
I will change the alert rule instead of modifying these parameters!

Through your reply, I have come to understand the importance of these parameters.
I will adjust our alert rule.
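
For example, something along these lines, excluding the Prometheus node from the latency alert (just a sketch; the instance pattern is a placeholder for our real labels):

```yaml
groups:
  - name: node-disk-write-latency-adjusted   # sketch, not our final rule
    rules:
      - alert: NODE_disk_write_latency
        # Exclude the Prometheus node; the instance pattern is only an example.
        expr: node_disk_write_latency{instance!~"prometheus.*"} > 0.016
        for: 5m
```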