I have a source of data in the form of log lines. These logs contain values such as latency (however that’s defined).
If I want the max latency seen between scrapes, I would need a tool that:
a) follows the input files as responsively as possible;
b) parses all new lines and extracts the values I want;
c) keeps only the max seen so far;
d) listens for scrapes and returns the maxes so far; and
e) resets those maxes to 0 for the next scraping period, making sure no data point is lost.
Is there such a tool, or do I need to write my own? mtail’s language does not seem expressive enough, and the grok exporter is definitely less suitable.
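For lack of an off-the-shelf answer, here is a rough sketch of what writing my own might look like, in Python with the prometheus_client library. The log path, port, metric name, and `latency=` regex are all placeholders, and the reset-on-scrape trick assumes a single Prometheus server scraping the endpoint:

```python
#!/usr/bin/env python3
"""Minimal sketch of a "max between scrapes" exporter (placeholders throughout)."""
import os
import re
import threading
import time

from prometheus_client import start_http_server
from prometheus_client.core import GaugeMetricFamily, REGISTRY

LOG_PATH = "/var/log/app.log"                          # placeholder path
LATENCY_RE = re.compile(r"latency=(\d+(?:\.\d+)?)")    # placeholder log format


class MaxBetweenScrapes:
    """Keeps the max value seen since the last scrape; resets on collect()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._max = 0.0

    def observe(self, value):
        with self._lock:
            if value > self._max:
                self._max = value

    def collect(self):
        # Called by prometheus_client on every scrape of /metrics.
        with self._lock:
            current, self._max = self._max, 0.0        # (e) reset for the next period
        yield GaugeMetricFamily(
            "log_latency_max",
            "Maximum latency parsed from the log since the previous scrape",
            value=current,
        )


def follow(path, collector):
    """(a) follow the file, (b) parse new lines, (c) feed the max tracker."""
    with open(path) as f:
        f.seek(0, os.SEEK_END)                         # start at the end, like tail -f
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)                        # cheap way to stay responsive
                continue
            m = LATENCY_RE.search(line)
            if m:
                collector.observe(float(m.group(1)))


if __name__ == "__main__":
    collector = MaxBetweenScrapes()
    REGISTRY.register(collector)                       # (d) serve the max on each scrape
    threading.Thread(target=follow, args=(LOG_PATH, collector), daemon=True).start()
    start_http_server(9200)                            # placeholder port
    while True:
        time.sleep(60)
```

It ignores log rotation, and if a scrape fails the max for that period is thrown away anyway, so "no data point is lost" is only as true as the scraping is reliable.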
TBH, I would have expected such a value to at least show up in a histogram-type series.
If you had the maximum value recorded for each 5-second interval over a 5,000-second run, and wanted to deduce the maximum value for the whole run, would an average of those 1,000 max values produce anything useful? No. You’d need the maximum of the maximums, which actually works, but only for the unique edge case of 100%…
But if you had the minimum value recorded for each 5-second interval over a 5,000-second run, and wanted to deduce the minimum value for the whole run, you’d have to take the minimum of the minimums. Which also works, but only for the unique edge case of 0%.
Gil Tene (it’s always him) is talking about how you can’t average percentiles, and while discussing it he drops those two gems, which are exactly the answers I’m looking for; he even gives me the idea that it wouldn’t hurt to monitor the minimum too.
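Just to convince myself of those two gems, a quick sanity check with made-up numbers (nothing here comes from real logs; the window size and distribution are arbitrary):

```python
# "Max of maxes" vs "average of maxes", with fake latency samples chopped
# into fixed-size windows standing in for the 5-second intervals.
import random

random.seed(1)
samples = [random.lognormvariate(0, 1) for _ in range(100_000)]       # fake latencies
windows = [samples[i:i + 100] for i in range(0, len(samples), 100)]    # 1,000 windows

interval_maxes = [max(w) for w in windows]
interval_mins = [min(w) for w in windows]

print(f"true max of the run  : {max(samples):.3f}")
print(f"max of interval maxes: {max(interval_maxes):.3f}")   # same number as the true max
print(f"avg of interval maxes: {sum(interval_maxes) / len(interval_maxes):.3f}")  # tells you nothing
print(f"min of interval mins : {min(interval_mins):.3f}")    # same as the true min, too
```

The average of the per-interval maxes is just a number with no relation to the run’s max; the max of the maxes is, by construction, the real max, and the min of the mins works the same way.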