Metric scrape efficiency


I have a bunch of Prometheus clients, each exposing a large number of metrics, and a server scraping them once every few seconds. The large number of metrics (~3000) results in a ~500 KB HTTP scrape response. With many clients, this creates a network bottleneck on the server.

Most of the time, the metric names/labels are identical from one scrape to the next. However, they're passed as plain text on every request, which is inefficient. In theory we only need to pass 3000 doubles, which should be about 24 KB instead of 500 KB, roughly a 20x savings on bandwidth.
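To illustrate the arithmetic (a quick sketch; the 500 KB figure is from my measurements above):

```python
import struct

NUM_METRICS = 3000

# Each sample value is a 64-bit float: 8 bytes.
packed = struct.pack(f"<{NUM_METRICS}d", *([0.0] * NUM_METRICS))
print(len(packed))            # 24000 bytes, i.e. ~24 KB
print(500_000 / len(packed))  # ~20x smaller than the ~500 KB text exposition
```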

Is there any existing way to optimize the scrape request size?

A simple architecture would be to hash the string of metric names + labels. The HTTP scrape request (from the Prometheus server) would pass that hash to the client. If the client's hash matches, it would respond with just the updated metric values as byte-serialized doubles (omitting names and labels). Otherwise it would respond with the normal plain-text exposition.
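A minimal sketch of the client-side handler for this scheme (all names here are hypothetical, not part of any Prometheus API; it assumes metrics are kept in a fixed order so the server can match values back to names):

```python
import hashlib
import struct

def schema_hash(metric_keys):
    """Hash the ordered list of metric names + label sets."""
    h = hashlib.sha256()
    for key in metric_keys:
        h.update(key.encode("utf-8"))
        h.update(b"\x00")  # separator so key boundaries are unambiguous
    return h.hexdigest()

def handle_scrape(server_hash, metrics):
    """
    metrics: dict mapping 'name{labels}' -> float value, in a stable order.
    server_hash: the hash the server cached from a previous full scrape
                 (None on the first scrape).
    Returns (content_type, body).
    """
    keys = list(metrics)
    current = schema_hash(keys)
    if server_hash == current:
        # Schema unchanged: send only the 8-byte doubles, in schema order.
        body = struct.pack(f"<{len(keys)}d", *metrics.values())
        return "application/x-compact-metrics", body
    # Schema changed (or first scrape): fall back to the normal text
    # exposition; the server re-hashes it and caches the new schema.
    lines = [f"{k} {v}" for k, v in metrics.items()]
    return "text/plain", "\n".join(lines).encode("utf-8")
```

Usage: the server sends its cached hash (e.g. in a request header); a matching hash gets back 8 bytes per metric, a mismatch gets the full text format plus a fresh hash to cache.

```python
m = {"up": 1.0, 'http_requests_total{code="200"}': 42.0}
ct, body = handle_scrape(schema_hash(list(m)), m)
# ct == "application/x-compact-metrics", len(body) == 16
```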