How to recycle/delete inactive metrics automatically in the Prometheus client

As the number of metrics increases, the resource consumption (mainly memory usage) of the Prometheus client also increases. We would therefore like to extend the Prometheus client so that it recycles/deletes inactive metrics automatically.

The solution we currently propose is to find metrics that have been inactive for a period of time (measured either by wall-clock time or by the number of scrapes that returned no new data), and then unregister the idle Collector from the CollectorRegistry.
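A minimal sketch of the time-based variant of this idea, in Python with the standard library only. `IdleSeriesTracker`, `touch`, and `expire` are hypothetical names invented for illustration, not part of any Prometheus client; the `remove` callback is where the real client call would go.

```python
import time
from typing import Callable, Dict, List, Tuple

LabelValues = Tuple[str, ...]


class IdleSeriesTracker:
    """Remembers when each label set was last updated (hypothetical helper)."""

    def __init__(self, max_idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._last_seen: Dict[LabelValues, float] = {}

    def touch(self, label_values: LabelValues) -> None:
        # Call this whenever the application updates the series.
        self._last_seen[label_values] = self._clock()

    def expire(self, remove: Callable[[LabelValues], None]) -> List[LabelValues]:
        # Find every series idle for longer than max_idle_seconds, forget it,
        # and pass it to `remove`, which should delete it from the client.
        now = self._clock()
        expired = [lv for lv, seen in self._last_seen.items()
                   if now - seen > self._max_idle]
        for lv in expired:
            del self._last_seen[lv]
            remove(lv)
        return expired
```

With the Python client, for example, `remove` could be something like `lambda lv: gauge.remove(*lv)`, called periodically from a background task. The sketch is not thread-safe as written; a real implementation would likely need a lock around `_last_seen`.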

We are curious whether there are similar proposals or best practices in the Prometheus community. Is there a way to

  • automatically clean up idle metrics on the client side?
  • set an upper limit on the number of metrics that an application client can generate/register?
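To make the second point concrete, here is one application-side way a cap could look, sketched in Python. `LabelLimiter` and the `"other"` overflow value are assumptions made up for this example, not a feature of any client library.

```python
from typing import Set


class LabelLimiter:
    """Caps the number of distinct label values (hypothetical helper)."""

    def __init__(self, max_values: int, overflow: str = "other") -> None:
        self._max = max_values
        self._overflow = overflow
        self._seen: Set[str] = set()

    def resolve(self, value: str) -> str:
        # Values that were already admitted keep their own series.
        if value in self._seen:
            return value
        # Admit new values until the cap is reached.
        if len(self._seen) < self._max:
            self._seen.add(value)
            return value
        # Everything beyond the cap is folded into one overflow series.
        return self._overflow
```

The application would pass every label value through `resolve()` before handing it to the metric, so the number of series for that label never exceeds `max_values + 1`.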

P.S. This topic may be labeled as “development” but I am not sure about it :grin:

What do you mean by “idle metrics”?

Metrics that have not produced a new data point for a long period (e.g. over several scrapes, or for several days).

There isn’t really such a concept in Prometheus. Every scrape (typically once a minute or so) produces a new data point in the TSDB, even if the value hasn’t changed. This should be the “current” value of whatever the metric represents, such as the number of orders, CPU percentage, HTTP requests, etc.

One thing I’m wondering is if you are trying to use Prometheus to store events, which it isn’t designed for.

These metrics are sparse. For example, the gauge metric “HTTP request success rate” has a URI label. Some URIs are rarely accessed, but once one has been accessed, its series remains in memory and cannot be cleaned up automatically. The Prometheus server then keeps scraping the same value for it.

It isn’t recommended to include URIs within labels, as that can lead to a cardinality explosion.
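A common way to keep such a label bounded is to record a normalized route instead of the raw URI. The following Python sketch shows the idea; `normalize_path` and the `:id` / `:uuid` placeholders are illustrative assumptions, and a web framework’s own route template would usually be a better source when one is available.

```python
import re

_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def normalize_path(uri: str) -> str:
    """Collapse variable path segments so the label has few distinct values."""
    # Drop the query string and fragment; they should never end up in a label.
    path = uri.split("?", 1)[0].split("#", 1)[0]
    segments = []
    for seg in path.split("/"):
        if _NUMERIC.match(seg):
            segments.append(":id")      # e.g. /orders/12345 -> /orders/:id
        elif _UUID.match(seg):
            segments.append(":uuid")
        else:
            segments.append(seg)
    return "/".join(segments) or "/"
```

The label value would then be `normalize_path(request_uri)` rather than the URI itself, so rarely visited URLs share a series with their siblings instead of each creating a new one.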

Thank you for your prompt reply :smiley:
We will take a closer look at our solution.