Single Prometheus restarts after being up for 2 weeks

I’m using docker-compose to run a single Prometheus instance, and everything went well for the first two weeks;
then, every few days, I saw the Prometheus container restart while the Grafana and Alertmanager containers stayed up.
Every time I delete the WAL, the memory occupied by Prometheus climbs from 4.4 GB to 5.3 GB and then it crashes.
Is there any way to reduce Prometheus’s memory usage, or does my Prometheus instance simply need more RAM?

My total RAM is 7.6 GB.

Thanks in advance!

Memory usage is mostly determined by the amount of metrics being scraped. So to reduce memory usage you can scrape fewer hosts, reduce the scrape frequency, or keep fewer labels or fewer metrics.
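As a sketch of the last two options (the metric name and interval here are just illustrations, not from this thread), you can lower the scrape frequency per job and drop known high-cardinality metrics at scrape time with `metric_relabel_configs`, so they never enter the TSDB head:

```yaml
scrape_configs:
  - job_name: 'kube-apiserver'
    scrape_interval: 60s          # less frequent than a typical 15s default
    metric_relabel_configs:
      # Drop a commonly high-cardinality histogram (example metric only)
      - source_labels: [__name__]
        regex: 'apiserver_request_duration_seconds_bucket'
        action: drop
```

Note that `metric_relabel_configs` runs after the scrape, so it reduces stored series but not the cost of the scrape itself.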

One way that this is commonly achieved is to shard by functional areas and run multiple servers.
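A minimal sketch of what functional sharding looks like (file names and job split are hypothetical): each Prometheus server gets its own config file owning a subset of the scrape jobs, and runs with its own storage.

```yaml
# prometheus-infra.yml — shard 1: infrastructure metrics (hypothetical split)
scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node1:9100', 'node2:9100']

# prometheus-apps.yml — shard 2: application metrics,
# run as a second Prometheus server on another host or port
```

Each shard then only holds the series for its own jobs in memory.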

Alternatively, if you want all the metrics, you will need to increase the amount of memory available.

Thanks for the reply.
Yes! I do want all of the metrics.
Maybe deleting the expired data periodically could help?

The length of time metrics are stored has minimal impact on memory usage (mostly it changes how much disk you are using). As mentioned, the main contributor to memory usage is the scrapes, so in this case you’d need to allocate more memory to Prometheus.
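For reference (a sketch with assumed paths; the image tag and limits are placeholders), retention is a flag on the Prometheus container and controls disk, while a compose-level memory limit at least makes an OOM kill the container cleanly instead of starving the host:

```yaml
services:
  prometheus:
    image: prom/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=15d'   # bounds disk usage, not memory
    mem_limit: 6g                              # container-level cap (compose v2 syntax)
```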

You are right. I checked my Prometheus instance and found there are more than 2000 targets in my job, e.g. [kube-apiserver (0 / 2259 active targets)], but we only have 3 apiserver pods in our cluster, which is weird.
Here is my job config for ‘kube-apiserver’; any suggestions?

```yaml
- job_name: 'kube-apiserver'
  scheme: https
  bearer_token_file: /etc/prometheus/k8s.token
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
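To see what that `keep` rule does to the 2259 discovered endpoints, here is a minimal Python sketch of Prometheus's `keep` relabel semantics (hypothetical target labels; the real implementation joins `source_labels` with `;` and requires the regex to match the whole joined value):

```python
import re

def keep_target(labels, source_labels, regex):
    """Return True if the target survives a `keep` relabel action.

    Mirrors Prometheus behaviour: source label values are joined with the
    default ';' separator, and the regex is fully anchored.
    """
    joined = ";".join(labels.get(l, "") for l in source_labels)
    return re.fullmatch(regex, joined) is not None

source = ["__meta_kubernetes_namespace",
          "__meta_kubernetes_service_name",
          "__meta_kubernetes_endpoint_port_name"]

# The apiserver endpoint matches and is kept...
apiserver = {"__meta_kubernetes_namespace": "default",
             "__meta_kubernetes_service_name": "kubernetes",
             "__meta_kubernetes_endpoint_port_name": "https"}
# ...while any other discovered endpoint is dropped.
other_pod = {"__meta_kubernetes_namespace": "monitoring",
             "__meta_kubernetes_service_name": "node-exporter",
             "__meta_kubernetes_endpoint_port_name": "metrics"}

print(keep_target(apiserver, source, "default;kubernetes;https"))  # True
print(keep_target(other_pod, source, "default;kubernetes;https"))  # False
```

So with `role: endpoints` every endpoint in the cluster is discovered (hence the 2259), but after the `keep` rule only the apiserver endpoints should remain active, provided the `relabel_configs` block is indented correctly under the job.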