Rate calculation results in hugely unexpected exponential values

akikul · December 1, 2021, 4:17pm

In short: when we query rate(api_heartbeat_seconds_bucket{le="1.0",instance="SITE-A:1234"}[2m]), we see unexpected spikes in the rate:
Fig 1

But when we query api_heartbeat_seconds_bucket{le="1.0",instance="SITE-A:1234"} without rate(), the steps increase smoothly:
Fig 2

Background: We were scraping three different Prometheus nodes into one federated Prometheus, and we saw a very small difference between the values from the nodes, a diff of 1 or 2 (let me know if you want to see the data from each node). So we decided to scrape only one node to test whether this made any difference.

We saw that since then we have not had any spikes. Fig 3

It is strange as we have had the multi-target configuration in place for months and it has only become an issue over the last couple of weeks.

FYI:
scrape_interval: 15s
scrape_timeout: 10s
evaluation_interval: 1m

The questions:

Is this a coincidence, with something else actually at play?
Why would the rate extrapolation be so sensitive and unpredictable?
I assume that api_heartbeat_seconds_bucket{le="1.0",instance="SITE-A:1234"} shows smooth steps because Prometheus applies some smoothing when it runs this query. Is this correct?
Is there something I could check to find the root cause of the spikes?

SuperQ · December 7, 2021, 10:30am

You likely have a small counter value drop, which is resulting in rate() thinking you have a counter reset. This single errant sample is the problem.
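
To illustrate why one slightly lower sample matters so much, here is a minimal Python sketch of the counter-reset correction. It is a simplification, not Prometheus's actual implementation: it leaves out the extrapolation to the window boundaries that rate() also performs, and the names simple_increase and simple_rate are made up for this example. The sample values are taken from the data posted later in this thread.

```python
def simple_increase(samples):
    """Increase of a counter over a window, with reset correction.

    samples: list of (timestamp_seconds, value), sorted by timestamp.
    Simplified: no extrapolation to the window boundaries.
    """
    correction = 0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # A lower value is read as a counter reset, so the previous
            # value is added back as if the counter had restarted at zero.
            correction += prev
        prev = value
    return samples[-1][1] - samples[0][1] + correction


def simple_rate(samples):
    """Per-second rate between the first and last sample of the window."""
    duration = samples[-1][0] - samples[0][0]
    return simple_increase(samples) / duration


# Two-minute window with monotonically increasing samples.
clean = [
    (1637544726.281, 1856437290),
    (1637544786.281, 1856466156),
    (1637544846.281, 1856494989),
]

# Same window, plus one sample that is lower by 1.
with_dip = [
    (1637544726.281, 1856437290),
    (1637544786.281, 1856466156),
    (1637544786.282, 1856466155),
    (1637544846.281, 1856494989),
]

print("clean rate:   ", simple_rate(clean))
print("with dip rate:", simple_rate(with_dip))
```

With the clean window the rate is a few hundred per second. With the dip, the whole previous counter value (about 1.86 billion) is added to the increase, so the computed rate jumps to roughly 15 million per second.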

If you query api_heartbeat_seconds_bucket{le="1.0",instance="SITE-A:1234"}[10m] with an instant query (table view in the UI), it will return the actual samples and timestamps for debugging.

As for why you are getting this, I don’t know.

akikul · January 19, 2022, 2:40pm

Thanks for the tip. Here is what I have got:

1856408410 @1637544666.282
1856437290 @1637544726.281
1856437290 @1637544726.282
1856466156 @1637544786.281
1856466155 @1637544786.282
1856494989 @1637544846.281
1856494989 @1637544846.282
1856523850 @1637544906.281
1856523850 @1637544906.282
1856552706 @1637544966.281
1856552705 @1637544966.282
1856581550 @1637545026.281
1856581548 @1637545026.282
1856610432 @1637545086.282
1856639309 @1637545146.281
1856639310 @1637545146.282
1856668165 @1637545206.282
1856697018 @1637545266.282
1856725885 @1637545326.282
1856754720 @1637545386.282
1856783600 @1637545446.282
1856812474 @1637545506.282
1856841370 @1637545566.281
1856841370 @1637545566.282
1856870286 @1637545626.281
1856870286 @1637545626.282
1856899171 @1637545686.281
1856899171 @1637545686.282
1856928067 @1637545746.281
1856928067 @1637545746.282
1856956962 @1637545806.282
1856985867 @1637545866.282
1857014787 @1637545926.282
1857043694 @1637545986.282
1857072573 @1637546046.282
1857101486 @1637546106.282
1857130313 @1637546166.281
1857130315 @1637546166.282
1857159196 @1637546226.281
1857159198 @1637546226.282
1857188104 @1637546286.282
1857216986 @1637546346.281
1857216986 @1637546346.282
1857245861 @1637546406.282
1857274774 @1637546466.282
1857303602 @1637546526.281
1857303602 @1637546526.282
1857332498 @1637546586.281
1857332499 @1637546586.282
1857361384 @1637546646.282
1857390280 @1637546706.282
1857419175 @1637546766.282
1857448064 @1637546826.281
1857448064 @1637546826.282
1857476776 @1637546886.281
1857476776 @1637546886.282
1857505645 @1637546946.282
1857534527 @1637547006.282
1857563409 @1637547066.282
1857592243 @1637547126.282
1857621083 @1637547186.281
1857621083 @1637547186.282
1857649933 @1637547246.282
1857678817 @1637547306.282
1857707733 @1637547366.281
1857707733 @1637547366.282
1857736628 @1637547426.281
1857736628 @1637547426.282
1857765509 @1637547486.281
1857765509 @1637547486.282
1857794348 @1637547546.282
1857823177 @1637547606.281
1857823177 @1637547606.282
1857852032 @1637547666.282
1857880902 @1637547726.282
1857909756 @1637547786.282
1857938611 @1637547846.282
1857938610 @1637547846.286
1857967502 @1637547906.282
1857996367 @1637547966.282
1858025273 @1637548026.282
1858054149 @1637548086.282
1858083041 @1637548146.281
1858083041 @1637548146.282
1858111935 @1637548206.282
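
A listing like the one above can be checked for decreasing samples with a few lines of Python. This is only a sketch: it assumes the "value @timestamp" layout shown in the table view, and parse_samples and find_drops are hypothetical helper names. The dump below is a short excerpt of the listing.

```python
def parse_samples(text):
    """Parse lines of the form '<value> @<timestamp>' into (timestamp, value)."""
    samples = []
    for line in text.strip().splitlines():
        value, _, ts = line.partition("@")
        samples.append((float(ts), int(value)))
    return samples


def find_drops(samples):
    """Return (prev_ts, prev_value, ts, value) wherever the value decreases."""
    drops = []
    for (prev_ts, prev_val), (ts, val) in zip(samples, samples[1:]):
        if val < prev_val:
            drops.append((prev_ts, prev_val, ts, val))
    return drops


# Excerpt of the samples posted above.
dump = """
1856437290 @1637544726.281
1856437290 @1637544726.282
1856466156 @1637544786.281
1856466155 @1637544786.282
1856494989 @1637544846.281
"""

drops = find_drops(parse_samples(dump))
for prev_ts, prev_val, ts, val in drops:
    print(f"{prev_val} @{prev_ts} -> {val} @{ts} (drop of {prev_val - val})")
```

In this excerpt the check flags the pair at 1637544786.281 and 1637544786.282, where the value goes from 1856466156 down to 1856466155. Each such pair is a place where rate() would see a counter reset.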

akikul · February 10, 2022, 1:42pm

I was hoping there was a way to confirm the behaviour I am seeing. Not sure if @SuperQ would be able to help. Thanks in advance.
