I am trying to build a generic platform for teams that do not want to expose their metrics directly and would rather push them to a generic processor, from which the metrics are then scraped. Pushgateway does exactly that; however, its documentation says it is recommended only for short-lived jobs such as batch jobs.
Now the questions I have are:
- Can I use Pushgateway for the purpose I mentioned? Are there any shortcomings?
- All my workloads run on Kubernetes. Can Pushgateway scale beyond a single pod?
- Does it internally aggregate any metrics?
Pushgateway is designed for short-lived jobs, as you mention, such as scheduled tasks (cron).
From what you are saying, it sounds like the Grafana Agent Operator or Prometheus in agent mode might be what you want.
Fundamentally, Prometheus is designed to be a pull-based system, so the expectation is that applications expose metrics to be scraped.
Thanks for replying, @stuart. The idea is to support applications that are not willing to expose their metrics but are OK with pushing them to an endpoint or a message broker. Would Pushgateway solve the problem in that case? Are there any downsides?
That is actively against the ideas behind Prometheus.
You would lose the use of the “up” metric, which means you won’t be able to detect some kinds of failure as easily. It also means things like scrape intervals need to be carefully matched with the application's push interval, because the Prometheus server(s) are no longer in sole control of metric fetching. You also have additional failure modes: the gateway failing or being misconfigured could stop metrics from flowing.
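To make the trade-offs above concrete, here is a minimal sketch of what a push looks like from the application side, using only the Python standard library. It relies on the Pushgateway HTTP API, where metrics in the text exposition format are sent to `/metrics/job/<job>` followed by optional grouping label pairs. The gateway address `http://pushgateway:9091`, the job name `team_a`, the metric `queue_depth`, and the helper names are all hypothetical, and the sketch assumes simple label values (no slashes or empty strings, which need special encoding). Since “up” no longer reflects the application, the `push_time_seconds` metric that Pushgateway exposes per group is one thing you could alert on to spot stale pushes.

```python
from urllib import request
from urllib.parse import quote


def push_url(gateway, job, grouping=None):
    """Build the Pushgateway URL for a job plus optional grouping labels.

    Assumes simple label values; values containing '/' or empty values
    need the special encodings described in the Pushgateway docs.
    """
    parts = [gateway.rstrip("/"), "metrics", "job", quote(job, safe="")]
    for name, value in sorted((grouping or {}).items()):
        parts += [quote(name, safe=""), quote(value, safe="")]
    return "/".join(parts)


def render_gauge(name, value, help_text):
    """Render a single gauge in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


def push(gateway, job, body, grouping=None):
    """Send the metrics with PUT, which replaces every metric in the group."""
    req = request.Request(
        push_url(gateway, job, grouping),
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with request.urlopen(req, timeout=5) as resp:
        return resp.status
```

A caller would do something like `push("http://pushgateway:9091", "team_a", render_gauge("queue_depth", 42, "Items waiting in the queue."), {"instance": "pod-1"})` on its own timer. Note that the gateway keeps only the last value pushed for each group, so the push interval has to be chosen with the Prometheus scrape interval in mind.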
I agree with you. However, since my workloads are Kubernetes-based, we have alternative means of getting the uptime metric. In general, can Pushgateway scale beyond a single pod and still write the results correctly? Do you have any suggestions for a push-based model? I know it goes against the philosophy of Prometheus, but our choices are limited.