Count all errors in a single metric

aymericDD · December 23, 2021, 9:57am

Hello Prometheus

Our developer team wants a generic metric like errors_total to count all errors. Is it a good idea to have this kind of metric? I am worried about the cardinality because they want to add the name of the exception and other labels.

Thanks in advance for your advice.
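To put a rough number on the cardinality concern: a metric can produce at most one time series per combination of label values, so the upper bound is the product of the distinct values of each label. A minimal sketch, where the label names and counts are made-up assumptions and not figures from this thread:

```python
from math import prod


def estimated_series(label_values: dict) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical label set for an errors_total metric (illustrative numbers only).
labels = {"service": 20, "exception": 50, "instance": 10}

print(estimated_series(labels))  # 10000
```

In practice not every combination occurs, so the real count is usually lower, but an unbounded label such as an exception name can still grow this number quickly.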

stuart · December 23, 2021, 10:34am

How sensible this is depends, to some degree, on how the code is structured.

One of the ideas behind having different metrics for different errors is that it allows cleaner code: instead of passing global objects around for that “errors_total” metric, you just need class-local objects, which are easier and cleaner to deal with.

You also have the advantage of being able to tweak things as needed: some of the classes might want an extra label to break things down in a useful way that would make no sense elsewhere.

In general I’d look at having something more specific than just a count of general errors. For many situations you can add extra labels to give better insights. For example rather than just a count of HTTP errors you can have a label which has the HTTP status code (and therefore can be useful for more than just error cases too).
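The status code idea can be sketched without any client library. The snippet below uses a plain `collections.Counter` as a stand-in for a labelled counter; the names `http_requests_total`, `observe` and `error_ratio` are hypothetical and only illustrate that one metric with a status label covers both error and non-error cases:

```python
from collections import Counter

# Stand-in for a labelled counter: one count per value of the "status" label.
http_requests_total = Counter()


def observe(status: int) -> None:
    """Record one request under its HTTP status code."""
    http_requests_total[str(status)] += 1


def error_ratio(counts) -> float:
    """Share of requests whose status is 5xx, derived from the same counter."""
    total = sum(counts.values())
    errors = sum(n for status, n in counts.items() if status.startswith("5"))
    return errors / total if total else 0.0


for status in (200, 200, 200, 404, 500):
    observe(status)

print(error_ratio(http_requests_total))  # 0.2
```

The same counts answer "how many errors?" and "how much traffic overall?", which a dedicated error-only counter could not do.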

aymericDD · December 23, 2021, 10:57am

Thanks for your reply @stuart, I totally agree with you. The developer teams want to filter errors by exception type in a centralised view. They also want to compare years for reporting, to see whether an old error reappears and to show whether an application has improved or regressed.
