Strange behavior when ingesting metrics with custom timestamp

Updates

  • I think this post explains the behavior a bit
  • I still don’t understand why only samples with explicit timestamps are affected

Original post

Hi,

I just started to write my own exporter. Some of my metrics need custom timestamps, because I can only observe them a while after the events actually took place.

My use case

To be precise, I try to export Azure Pipelines build metrics. A build starts, runs and finishes at some point in time. My exporter gathers the finished builds and exports their duration as a metric. Since a build can take quite a while, I would like to set the sample timestamp at scrape time to the start time of the build rather than the scrape time.
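
Roughly, the relevant part of my collector looks like the following sketch (get_finished_builds and the build attributes are placeholders standing in for my actual Azure DevOps client code):

from prometheus_client.registry import Collector
from prometheus_client.core import GaugeMetricFamily


class BuildCollector(Collector):
    def collect(self):
        # get_finished_builds() is a placeholder for my Azure DevOps client code;
        # each build is assumed to have pipeline_name, start_time and finish_time
        # (datetime) attributes.
        duration = GaugeMetricFamily(
            "azure_pipeline_build_duration_seconds",
            "Duration of finished Azure pipeline builds",
            labels=["pipeline"],
        )
        for build in get_finished_builds():
            duration.add_metric(
                [build.pipeline_name],
                (build.finish_time - build.start_time).total_seconds(),
                # Explicit timestamp: the start of the build, not the scrape time.
                timestamp=build.start_time.timestamp(),
            )
        yield duration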

I am using the Python client library and I am currently stuck with (seemingly) duplicated data. As soon as I add a timestamp myself, the data appears duplicated for a while (at least visually). Please note that the number of exported samples varies between scrapes.

Reproduce

As this is hard to describe, I created a small example that reproduces the behavior with only prometheus_client and Twisted as dependencies:

from prometheus_client import (
    generate_latest,
    CollectorRegistry,
)
from prometheus_client.registry import Collector
from prometheus_client.core import GaugeMetricFamily
from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor
import random
from datetime import datetime


class CustomCollector(Collector):
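    # Counts how many times collect() has been called, so the "one time"
    # samples below are only emitted on the very first scrape.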
    collect_call_counter = 0

    def __init__(self):
        super().__init__()

    def collect(self):
        c = GaugeMetricFamily("my_gauge", "No clue what the heck is going on", labels=["Type"])
        # Just for reference
        c.add_metric(["random"], random.randint(0, 10))
        if self.collect_call_counter == 0:
            # Simulate 'one time' metrics. Once with a custom timestamp, once without.
            c.add_metric(["only-once-with-timestamp"], random.randint(10, 20), datetime.now().timestamp())
            c.add_metric(["only-once-without-timestamp"], random.randint(10, 20))
        yield c
        self.collect_call_counter += 1
        print(f"collect number: {self.collect_call_counter}")


class CustomResource(Resource):
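    # A dedicated registry that only contains the custom collector, so the
    # default process/platform collectors are not part of the output.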
    registry = CollectorRegistry()
    registry.register(CustomCollector())

    def __init__(self):
        Resource.__init__(self)

    def render_GET(self, request):
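        # generate_latest() renders the registry in the Prometheus text exposition format.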
        output = generate_latest(self.registry)

        request.setHeader("Content-Type", "text/plain; charset=UTF-8")
        request.setResponseCode(200)
        print(output.decode())
        return output


root = Resource()
root.putChild(b"metrics", CustomResource())

factory = Site(root)
reactor.listenTCP(8000, factory)
reactor.run()

Exported data and results

For reference, this ‘snippet’ produces the following metrics when scraped (three scrapes shown; the collect number: x lines come from the print statement and are not part of the exposition):

collect number: 1
# HELP my_gauge No clue what the heck is going on
# TYPE my_gauge gauge
my_gauge{Type="random"} 10.0
my_gauge{Type="only-once-with-timestamp"} 18.0 1739893690370
my_gauge{Type="only-once-without-timestamp"} 17.0

collect number: 2
# HELP my_gauge No clue what the heck is going on
# TYPE my_gauge gauge
my_gauge{Type="random"} 5.0

collect number: 3
# HELP my_gauge No clue what the heck is going on
# TYPE my_gauge gauge
my_gauge{Type="random"} 4.0

In Prometheus (and Grafana) I get strange results:

  • I would expect the green line to be a single dot (just like the blue one)
  • As you can see, I stopped the exporter and the red line (the "random" reference metric) stopped as well, but the green one is still going

The Prometheus logs (debug level) do not show anything relevant. I can provide them if needed.

Any help is really appreciated.

Versions:

  • Prometheus: version=3.2.0, branch=HEAD, revision=bf7ec404b2a4933a81b0ffe2baba32b92d471655
  • prometheus_client (installed via pip): 0.21.1