Prometheus for large clusters (about 1000 nodes)


We have a problem with a large cluster that uses only one Prometheus instance (because of cost).

Cluster: 1000+ nodes
Prometheus: 1 node with around 60 GB of memory (still gets OOM-killed when running queries from Kiali and Grafana)
Thanos: Thanos-Sidecar
Istio, Kiali, Grafana and Jaeger

In short, a large cluster (1000+ nodes) with a single Prometheus runs out of memory because of the large volume of metrics.

We are planning to implement the following:

Design approach:
Run multiple Prometheus instances for each namespace with the same scrape configs, and use thanos-query as the data source for Grafana (so that duplicates are removed by deduplication).
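
To make the deduplication part concrete, here is a rough Python sketch of the idea: series that differ only by a replica label are collapsed into one. This is a simplified illustration, not how thanos-query is actually implemented. The label name `prometheus_replica` and the sample data are assumptions made up for the example; the real label depends on the external labels set on each Prometheus.

```python
from collections import defaultdict

# Hypothetical replica label; the real name depends on the external labels
# configured on each Prometheus instance.
REPLICA_LABEL = "prometheus_replica"


def deduplicate(series, replica_label=REPLICA_LABEL):
    """Collapse series that differ only by the replica label.

    `series` is a list of (labels, samples) pairs, where labels is a dict
    and samples is a list of (timestamp, value) tuples.
    """
    merged = defaultdict(dict)
    for labels, samples in series:
        # Identity of a series once the replica label is ignored.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        for timestamp, value in samples:
            # Keep the first value seen for each timestamp; later replicas
            # only fill gaps.
            merged[key].setdefault(timestamp, value)
    return [(dict(key), sorted(points.items())) for key, points in merged.items()]


if __name__ == "__main__":
    # Made-up samples from two instances scraping the same target.
    raw = [
        ({"__name__": "istio_requests_total", "namespace": "shop", REPLICA_LABEL: "a"},
         [(1, 10.0), (2, 12.0)]),
        ({"__name__": "istio_requests_total", "namespace": "shop", REPLICA_LABEL: "b"},
         [(2, 12.0), (3, 15.0)]),
    ]
    for labels, samples in deduplicate(raw):
        print(labels, samples)
```

A client that reads from the individual Prometheus instances directly skips this step and sees one copy of each series per instance.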

Issue with this approach:
Kiali reads data directly from Prometheus (not from Thanos), so in this setup it would face multiple Prometheus instances and duplicate metrics.

Sadly, removing unnecessary metrics is not an option in the organization right now.

I would appreciate any suggestions for a better approach.

Hi @plee,
How did you deal with this case?