Prometheus crashes during compaction process

Hi all,

I’m new to the forum so let me first introduce myself:
My name is Pim and I’m a software enthusiast from the Netherlands. I’m using Prometheus for one of my hobby projects to measure indoor climate (air quality, temperature, etc.). Professionally I’m a Java back-end developer, mainly in the fintech industry.

I’m not sure if this post belongs here or in the general/support channel, but since it applies to Prometheus server, I opted for this location.

I’m running Prometheus in a Docker container along with my Java Spring Boot app, Postgres and Grafana on a Raspberry Pi 4 with 4 GB of RAM, running Raspberry Pi OS (32-bit). I was on Prometheus 2.25.2 and tried downgrading to 2.24.x, but to no avail.

My problem is that my Prometheus server started crashing lately during the compaction process. The logs mention a failure to allocate memory and a file already being closed; see the more detailed log snippet below. It seems to be requesting a 128 MB allocation, and I suppose I have plenty of memory (2424 MB available and 642 MB free), so I’m not really sure why this happens.
I have about 1 year of history. I started with a couple of sensors, but it has grown to about 930 series today.
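
For reference, the series count comes from Prometheus’s own TSDB metric; a minimal check (assuming Prometheus scrapes itself under the default job name) is:

prometheus_tsdb_head_series{job="prometheus"}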

I hope to get some pointers on how to improve my setup so it stops crashing.

Here is a snapshot of a Prometheus dashboard in Grafana of the past 7 days:
https://snapshot.raintank.io/dashboard/snapshot/odeWxReUZ7fHGlE0bkQkAq0Y3Nb0qYDx

Log snippet:
level=info ts=2021-04-17T19:55:23.747Z caller=compact.go:507 component=tsdb msg="write block" mint=1618678485953 maxt=1618682400000 ulid=01F3GPDTRF3RGYPE28T7TS0J6M duration=2.003752952s
level=info ts=2021-04-17T19:55:23.795Z caller=head.go:824 component=tsdb msg="Head GC completed" duration=17.450022ms
level=info ts=2021-04-17T19:55:24.236Z caller=compact.go:448 component=tsdb msg="compact blocks" count=2 mint=1618649310953 maxt=1618660800000 ulid=01F3GPDWT1HM69QWR09MPZD7H0 sources="[01F3FTJDBCCMT9A9QV47A9GDB7 01F3FYNASWMA2MBPM357FCV4CV]" duration=394.930521ms
level=info ts=2021-04-17T19:55:24.255Z caller=db.go:1191 component=tsdb msg="Deleting obsolete block" block=01F3FYNASWMA2MBPM357FCV4CV
level=info ts=2021-04-17T19:55:24.267Z caller=db.go:1191 component=tsdb msg="Deleting obsolete block" block=01F3FTJDBCCMT9A9QV47A9GDB7
level=info ts=2021-04-17T21:00:46.229Z caller=compact.go:507 component=tsdb msg="write block" mint=1618682421670 maxt=1618689600000 ulid=01F3GT5K128HRMKAHTS0820AZ9 duration=242.687879ms
level=info ts=2021-04-17T21:00:46.251Z caller=head.go:824 component=tsdb msg="Head GC completed" duration=7.452444ms
level=error ts=2021-04-17T21:00:46.284Z caller=db.go:745 component=tsdb msg="compaction failed" err="compact head: head memory truncate: truncate chunks.HeadReadWriter: mmap, size 134217728: cannot allocate memory"
level=info ts=2021-04-17T21:01:48.703Z caller=compact.go:448 component=tsdb msg="compact blocks" count=2 mint=1618660800000 maxt=1618682400000 ulid=01F3GT7FWZN8T2AQW3A1AF7FX8 sources="[01F3GEGTAWMYRXNPXATGQJVFEP 01F3GPDTRF3RGYPE28T7TS0J6M]" duration=384.301167ms
level=info ts=2021-04-17T21:01:48.721Z caller=db.go:1191 component=tsdb msg="Deleting obsolete block" block=01F3GPDTRF3RGYPE28T7TS0J6M
level=info ts=2021-04-17T21:01:48.734Z caller=db.go:1191 component=tsdb msg="Deleting obsolete block" block=01F3GEGTAWMYRXNPXATGQJVFEP
level=info ts=2021-04-17T23:00:46.237Z caller=compact.go:507 component=tsdb msg="write block" mint=1618689621670 maxt=1618696800000 ulid=01F3H11A8Z5CV95JQPBSNSHKPG duration=253.839075ms
level=info ts=2021-04-17T23:00:46.259Z caller=head.go:824 component=tsdb msg="Head GC completed" duration=7.374657ms
level=error ts=2021-04-17T23:00:46.259Z caller=db.go:745 component=tsdb msg="compaction failed" err="compact head: head memory truncate: truncate chunks.HeadReadWriter: write /prometheus/chunks_head/001838: file already closed"
panic: write /prometheus/chunks_head/001838: file already closed

goroutine 428 [running]:
github.com/prometheus/prometheus/tsdb.(*memSeries).mmapCurrentHeadChunk(0x5ae9600, 0x5356240)
/app/tsdb/head.go:2046 +0x228
github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk(0x5ae9600, 0xe24738a6, 0x178, 0x5356240, 0x0)
/app/tsdb/head.go:2017 +0x24
github.com/prometheus/prometheus/tsdb.(*memSeries).append(0x5ae9600, 0xe24738a6, 0x178, 0xba5e353f, 0x40d56ee1, 0x2dc, 0x0, 0x5356240, 0x17ebcac)
/app/tsdb/head.go:2173 +0x3a0
github.com/prometheus/prometheus/tsdb.(*headAppender).Commit(0x5c216e0, 0x0, 0x0)
/app/tsdb/head.go:1265 +0x208
github.com/prometheus/prometheus/tsdb.dbAppender.Commit(0x257b918, 0x5c216e0, 0x5174480, 0x0, 0x67f1c4da)
/app/tsdb/db.go:773 +0x24
github.com/prometheus/prometheus/storage.(*fanoutAppender).Commit(0x6114fc0, 0x4b71555, 0x0)
/app/storage/fanout.go:174 +0x28
github.com/prometheus/prometheus/scrape.(*scrapeLoop).scrapeAndReport.func1(0x5ca9d90, 0x6641d98, 0x5262180)
/app/scrape/scrape.go:1086 +0x38
github.com/prometheus/prometheus/scrape.(*scrapeLoop).scrapeAndReport(0x5262180, 0xf8475800, 0xd, 0x540be400, 0x2, 0x67f6334f, 0xc016fb76, 0xb6ca72d7, 0x13e8, 0x33ecf78, …)
/app/scrape/scrape.go:1153 +0x748
github.com/prometheus/prometheus/scrape.(*scrapeLoop).run(0x5262180, 0xf8475800, 0xd, 0x540be400, 0x2, 0x0)
/app/scrape/scrape.go:1039 +0x29c
created by github.com/prometheus/prometheus/scrape.(*scrapePool).sync
/app/scrape/scrape.go:510 +0x728

The problem is likely the 32-bit OS.

Prometheus uses mmap and virtual memory to manage access to the stored data, which means it needs address space to map the on-disk data into. On a 32-bit system there is only 4 GiB of virtual address space for both running memory and mapped storage, and because of the kernel/user split, even less than that is available to any individual process.

The default Raspberry Pi kernel settings make this worse: they change the kernel/userspace reservation from the usual 1 GiB/3 GiB to 2 GiB/2 GiB. This means a single process can only use 2 GiB of address space in total, covering both its RSS and its VSS.
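
A rough way to watch this from Prometheus itself is to compare its reported virtual memory against that ceiling. This is only a sketch: the 2 GiB figure is the assumed per-process limit described above, and the job label assumes the default self-scrape job:

process_virtual_memory_bytes{job="prometheus"} / (2 * 1024 * 1024 * 1024)

If that ratio creeps towards 1, the process is running out of address space, which matches the mmap "cannot allocate memory" errors in the log above.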

I would recommend using a 64-bit Raspberry Pi distro.

I forgot to mention: the only workaround to get out of the crash loop is to delete some of the older TSDB data directories. This reduces the virtual memory needed, so Prometheus can recover.
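
To get a feel for how much on-disk block data Prometheus currently has loaded (and therefore mapped into its address space), the TSDB’s own metrics should give a rough idea. A minimal sketch, again assuming the default self-scrape job label:

prometheus_tsdb_storage_blocks_bytes{job="prometheus"}
prometheus_tsdb_blocks_loaded{job="prometheus"}

Deleting old blocks shrinks the first number, which is roughly the amount of storage that has to fit into the process’s address space alongside its regular memory.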

Thanks a lot SuperQ. I was fearing as much already, but before I start migrating to a 64-bit OS I wanted to confirm (more or less) that it would be worth the effort. The fact that it is 32-bit has been frustrating me for quite some time for other reasons too.

One thing I still don’t understand, though: a 32-bit OS is limited to 4 GB of memory, which is the physical limit of my Pi model. So would the main benefit come from the OS/kernel settings, so that instead of a 2/2 split it becomes 1/3 and a process basically gains one additional GB of virtual memory?

Also, mmap, size 134217728: isn’t this in bytes? That is 128 MB, so it should be nowhere near the limit?

Now that I’m writing this, could it be the memory limit of Docker itself? Of course, that is outside the scope of this support forum :smiley: I suppose I will have to experiment on my Raspberry Pi 400 board with a 64-bit OS and/or Docker :slight_smile:

I will try to update the post once I have some results.

If you can query for it, try this query:

process_virtual_memory_bytes / (process_virtual_memory_max_bytes > 0)

That should give you an idea if you’ve got a problem or not.

Sadly process_virtual_memory_max_bytes yields -1

Hrm, I wonder what’s causing that. Perhaps we have a bug in client_golang.

Either way, just looking at process_virtual_memory_bytes should tell you what the Prometheus process thinks it’s using.
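
For example (a minimal sketch; the job label assumes the default self-scrape job, and the resident-memory metric is only included for comparison):

# Virtual address space the Prometheus process has mapped:
process_virtual_memory_bytes{job="prometheus"}

# Actual resident memory, for comparison:
process_resident_memory_bytes{job="prometheus"}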

Bingo, 2697420800 bytes. In other words, about 2.5 GB…

That makes more sense than the 128 MB from the console :upside_down_face:

I suspect it’s -1 due to how its container is handled. I’m running it with docker-compose. I tried to set the memory reservation and limits in the docker-compose.yml file, but then I’d have to deploy/run it with Docker Swarm, and I still have to figure out how to do that. I’m not that experienced with Docker.

BTW. I upgraded from 2.24.1 back to 2.25.2 and process_virtual_memory_max_bytes is still -1.

Actually, the -1 is due to a bug in reading ulimits.

This is fixed by updating any binaries to the latest client_golang. The latest Prometheus version should have this fix.

Oh, you know, I think that patch is also buggy, as it expresses “unlimited” as a 64-bit value even on a 32-bit system. :disappointed:

Ah, I upgraded to 2.26.0 and now it returns 18446744073709552000 for process_virtual_memory_max_bytes. That’s an awful lot… (it looks like 2^64, i.e. 18446744073709551616, rounded by the float formatting).

So your original query (process_virtual_memory_bytes / (process_virtual_memory_max_bytes > 0)) now returns 0.0000000001391018411567302

I have no clue if that’s good or bad, what are your thoughts?

BTW, don’t both the -1 and this large number simply represent ‘unlimited’?

Yea, it turns out that the new value is also bogus, as it doesn’t take the architecture into account. :frowning:

But yes, the value represents “unlimited” from a ulimit perspective. But there are still limits based on architecture. I’ve been researching ways to handle this, but there doesn’t seem to be any reliable way to ask the system what the maximum possible VSS is, to even get an estimate.

If I’m reading this correctly, the value for 64-bit should actually be lower: only 48 bits, i.e. 2^48 bytes (about 256 TiB) of virtual address space.

Thankfully:

This is still 65,536 times larger than the virtual 4 GB address space of 32-bit machines.
:sweat_smile:

Thanks for the link, today I learned.

It’s been a while, and it took more time than I anticipated, but I’ve finally swapped my SD card for another one with Ubuntu Server (20.04 LTS), migrated my Docker volumes and spun the containers up again.

I can happily report that Prometheus has been stable for 24+ hours now! :smiley:
NB: I didn’t lose any data or history; it’s all preserved.

So it’s pretty safe to say 32-bit (or at least Raspberry Pi OS) was the culprit here.
NB2: I’m using another SD card now, but of the same brand/type (Samsung Evo Plus 64GB).


Awesome to read!!

Do we need a github issue to track this?

@roidelapluie Yes, here: Limits returns 64-bit max for VSS on all platforms · Issue #379 · prometheus/procfs · GitHub

So far, there’s no conclusion on what to do; none of the options are great. We either provide no information, +Inf (which is misleading), or bogus information.