How to sum up Gauges in PromQL (for alerting)

Hi there,

I have my own exporter that exposes some gauges:

my_latencies{endpoint="a", metric="p100"} 12
my_latencies{endpoint="a", metric="p50"} 7
my_latencies{endpoint="a", metric="p0"} 1

my_latencies{endpoint="b", metric="p100"} 299

It’s important to know that these gauges are scraped every minute, but not every scrape returns data, so plotted graphs will contain holes.

What I want to achieve is an alert that fires when the sum of the latencies exceeds a certain threshold. For example, the p100 latency for endpoint a is 12 (less than 300) and for endpoint b it is 299 (also less than 300), but the sum of 311 exceeds the threshold and should trigger an alert.

I tried to use Grafana for this.

My first panel shows 4 endpoints, the second panel shows the same 4 endpoints stacked, and I wanted a third panel (with the same outline as the stacked one) containing only one entry, “total”. I was able to get there using the Grafana transformation “Add field from calculation” together with “Replace all fields”. But now alerting becomes impossible due to “Transformations are not supported in alert queries”.

Right now I am hoping to solve this problem purely within PromQL.

Some popular examples for gauges use subtraction:

node_memory_MemTotal_bytes{...} - node_memory_MemFree_bytes{...}
node_filesystem_size_bytes{...} - node_filesystem_avail_bytes{...}

So I assumed that something like this should be possible:

my_latencies{endpoint="a", metric="p100"} + my_latencies{endpoint="b", metric="p100"}

But I am only getting “Empty query result”.
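
One possible cause is label matching: as far as I understand, a binary operator like + only pairs series whose label sets are identical, and here the two sides differ in the endpoint label, so nothing would ever match. An untested sketch of two variants that avoid this, assuming the threshold of 300 from my example and that only the endpoint label differs between the series:

```promql
# Variant 1: tell the operator to ignore the differing label.
# Still returns nothing if either side has no sample at evaluation time.
my_latencies{endpoint="a", metric="p100"}
  + ignoring(endpoint)
my_latencies{endpoint="b", metric="p100"}

# Variant 2: aggregate over all endpoints that currently have a sample.
# Usable as an alert expression: it returns a result only above 300.
sum by (metric) (my_latencies{metric="p100"}) > 300
```

The second variant would add up whatever endpoints are present instead of requiring both a and b, which seems closer to the “total” panel I built in Grafana.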

I think this might be because of the missing data (the exporter only exposes endpoint latencies when the endpoint was consumed; think of the users of my system simply sleeping at night).
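
If the gaps are the issue, one idea would be to carry the last known value of each series forward over a lookback window before summing. An untested sketch, where the 10m window and the threshold of 300 are my assumptions and last_over_time needs a reasonably recent Prometheus version:

```promql
# For each p100 series, take its most recent sample from the last 10 minutes
# (assumed window), then add up all endpoints and compare with the threshold.
sum by (metric) (
  last_over_time(my_latencies{metric="p100"}[10m])
) > 300
```

The trade-off is that an old latency value keeps counting toward the sum until it falls out of the window.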

Thanks for any help,
Marco Schmitz