Single Prometheus instance restarts after running for 2 weeks

yuxi · May 16, 2023, 7:22am

Hello!
I'm using docker-compose to run a single Prometheus instance. Everything went well for the first two weeks;
then, over a few days, I saw the Prometheus container restart while the Grafana and Alertmanager containers stayed up.
Every time I delete the WAL, the memory used by Prometheus climbs from 4.4g to 5.3g and then it crashes.
Is there any way to reduce Prometheus's memory usage, or does my Prometheus instance need more RAM?

My total RAM is 7.6g

Thanks in advance!

stuart · May 16, 2023, 9:50am

Memory usage is mostly determined by the number of metrics being scraped. So to reduce memory usage you can scrape fewer hosts, reduce the scrape frequency, or have fewer labels or fewer metrics.

One way that this is commonly achieved is to shard by functional areas and run multiple servers.

Alternatively, if you want all the metrics, you will need to increase the amount of memory available.
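
As a rough illustration of two of the options above (a lower scrape frequency and fewer metrics), a prometheus.yml fragment might look like the sketch below. The job name, target, interval, and metric-name regex are placeholders made up for the example, not values from this thread.

```yaml
# Sketch only: job name, target, interval, and regex are hypothetical.
global:
  scrape_interval: 60s              # scrape less often than a shorter interval would
scrape_configs:
  - job_name: 'example-job'         # hypothetical job
    static_configs:
      - targets: ['localhost:9100'] # hypothetical target
    metric_relabel_configs:
      # Drop samples whose metric name matches the regex before they are stored.
      - source_labels: [__name__]
        action: drop
        regex: 'example_unused_metric_.*'
```

Which metrics are safe to drop depends on the dashboards and alerts that use them, so the regex has to be chosen per setup.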

yuxi · May 16, 2023, 10:17am

Thanks for the reply.
Yes! I do want all of the metrics.
Maybe deleting the expired data periodically can help?

stuart · May 16, 2023, 11:56am

The length of time metrics are stored for has minimal impact on memory usage (mostly it will change how much disk you are using). As mentioned, the main contributor to memory usage will be the scrapes, so in this case you'd need to allocate more memory to Prometheus.

yuxi · May 19, 2023, 8:56am

You are right. I checked my Prometheus profile and found more than 2000 targets listed for one job, e.g. [kube-apiserver (0 / 2259 active targets)], but we only have 3 apiserver pods in our cluster. It's weird.
Here is my job for 'kube-apiserver'. Any suggestions?

- job_name: 'kube-apiserver'
  kubernetes_sd_configs:
    - role: endpoints
      api_server: https://10.1.60.140:6443
      bearer_token_file: /etc/prometheus/k8s.token
      tls_config:
        insecure_skip_verify: true
  bearer_token_file: /etc/prometheus/k8s.token
  scheme: https
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
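
One thing that might be worth trying here, as an untested sketch: the job above uses role: endpoints without a namespace restriction, so discovery covers endpoints in every namespace before the keep rule is applied. The kubernetes_sd_configs section accepts a namespaces list, and the variant below assumes that only the default namespace is needed for this job. Whether this changes the memory behaviour described in this thread has not been verified.

```yaml
# Sketch only: same job as above, with discovery limited to one namespace.
- job_name: 'kube-apiserver'
  scheme: https
  bearer_token_file: /etc/prometheus/k8s.token
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
    - role: endpoints
      api_server: https://10.1.60.140:6443
      bearer_token_file: /etc/prometheus/k8s.token
      tls_config:
        insecure_skip_verify: true
      # Assumption: the kubernetes service endpoints live in "default".
      namespaces:
        names:
          - default
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
```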
