I have a large number of VMs (10k to 20k) that need to be monitored via Prometheus. We have already installed the agents on them.
Additionally, we have grouped our VMs so that multiple Prometheus instances are available for the different BUs.
Also, we are using Mimir as long-term storage for the metrics.
It would be great if I could get a clear picture on the points below.
- How do I figure out or judge the capacity of a single Prometheus server? What are the parameters for calculating the scraping limit of a single Prometheus server? E.g. if a BU's datacenter has 1k VMs, how do I determine whether all of those servers can be scraped by a single Prometheus server?
I am aware of the capacity formula below (which estimates disk space rather than memory), but does it also hold good for judging scrape capacity?
needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample
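To make the formula concrete, here is a rough sizing sketch for the 1k-VM example. All the input numbers are assumptions for illustration (series per VM and scrape interval depend entirely on the exporters and config; ~1-2 bytes per sample is the average Prometheus's docs cite for compressed TSDB samples):

```python
# Hypothetical sizing example for one BU's Prometheus server.
# All inputs below are assumptions, not measured values.
vms = 1_000
series_per_vm = 1_000          # assumption; depends on exporters enabled per VM
scrape_interval_s = 15         # assumption; a common default
bytes_per_sample = 1.5         # rough average with TSDB compression
retention_s = 15 * 24 * 3600   # 15 days of local retention

ingested_samples_per_second = vms * series_per_vm / scrape_interval_s
needed_disk_bytes = retention_s * ingested_samples_per_second * bytes_per_sample

print(f"active series:  {vms * series_per_vm:,}")
print(f"samples/second: {ingested_samples_per_second:,.0f}")   # roughly 67k
print(f"disk needed:    {needed_disk_bytes / 1e9:,.1f} GB")    # roughly 130 GB
```

Note this only bounds disk; scrape capacity in practice is usually limited first by memory (driven by the number of active series) and CPU, which this formula does not capture.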
- In the case of external remote write (Grafana Mimir): again, how do I determine the capacity of a single Prometheus server to scrape the VMs? → Any documentation or best practices?
- How does the WAL work when external remote write is in use, and how does it affect scrape capacity?
- How does the Prometheus cache work with remote storage?