Using Helm charts on a Raspberry Pi cluster runs into memory issues

Hello, I ended up here after coming across GitHub issue #8661.

I am trying to set up Prometheus and Grafana via HelmRepositories with Flux.

My prometheus-server pods crash and report issues with memory:

panic: mmap, size 134217728: cannot allocate memory

goroutine 482 [running]:*memSeries).mmapCurrentHeadChunk(0x71b4630, 0x487dd40)
	/app/tsdb/head.go:2233 +0x22c*memSeries).cutNewHeadChunk(0x71b4630, 0x48323ba7, 0x17a, 0x487dd40, 0x1)
	/app/tsdb/head.go:2204 +0x24*memSeries).append(0x71b4630, 0x48323ba7, 0x17a, 0x0, 0x40f056a0, 0x0, 0x0, 0x487dd40, 0x2620101)
	/app/tsdb/head.go:2360 +0x3a8*Head).processWALSamples(0x3d1dc20, 0x47c30100, 0x17a, 0x8e2acc0, 0x8e2ac80, 0x0, 0x0)
	/app/tsdb/head.go:425 +0x270*Head).loadWAL.func5(0x3d1dc20, 0x8402360, 0x8402370, 0x8e2acc0, 0x8e2ac80)
	/app/tsdb/head.go:519 +0x40
created by*Head).loadWAL
	/app/tsdb/head.go:518 +0x268
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc pc=0x17127f8]

goroutine 479 [running]:
	/usr/local/go/src/bufio/bufio.go:624*ChunkDiskMapper).WriteChunk(0x487dd40, 0x2f50, 0x0, 0x47c63366, 0x17a, 0x483066e6, 0x17a, 0x26914c4, 0x285e4660, 0x0, ...)
	/app/tsdb/chunks/head_chunks.go:291 +0x54c*memSeries).mmapCurrentHeadChunk(0x71b4790, 0x487dd40)
	/app/tsdb/head.go:2230 +0x6c*memSeries).cutNewHeadChunk(0x71b4790, 0x48323ba7, 0x17a, 0x487dd40, 0x1)
	/app/tsdb/head.go:2204 +0x24*memSeries).append(0x71b4790, 0x48323ba7, 0x17a, 0x0, 0x40f05a60, 0x0, 0x0, 0x487dd40, 0x10101)
	/app/tsdb/head.go:2360 +0x3a8*Head).processWALSamples(0x3d1dc20, 0x47c30100, 0x17a, 0x8e2aa40, 0x8e2a980, 0x0, 0x0)
	/app/tsdb/head.go:425 +0x270*Head).loadWAL.func5(0x3d1dc20, 0x8402360, 0x8402370, 0x8e2aa40, 0x8e2a980)
	/app/tsdb/head.go:519 +0x40
created by*Head).loadWAL
	/app/tsdb/head.go:518 +0x268

This warning also caught my attention. I am not sure whether an arm64 image would solve my problem, and I could not figure out how to set the chart values to pull the right image:

level=warn ts=2021-07-22T19:10:46.865Z caller=main.go:420 msg="This Prometheus binary has not been compiled for a 64-bit architecture. Due to virtual memory constraints of 32-bit systems, it is highly recommended to switch to a 64-bit binary of Prometheus." GOARCH=arm

I was wrong about the architecture on the Pi 4: it is actually armv7l, so the warning is correct, and fixing that is off topic here (AFAIK).

Yes, Prometheus uses memory mapping (mmap) to access TSDB pages. On Raspberry Pi, the kernel is 32-bit, and it is compiled so that only half of the 32-bit virtual address space (2GB out of 4GB) is available to each process. This means that the TSDB mmap code runs out of virtual memory, not physical RAM, at around 2GB of data.
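
As a rough illustration of that arithmetic, here is a minimal Python sketch. The 2 GiB per-process figure comes from the explanation above and the mapping size from the "mmap, size 134217728" panic in the original post; the helper name and the amount reserved for the binary, heap and stacks are made-up placeholders, not measured values.

```python
# Back-of-the-envelope estimate: how many equally sized mappings fit into
# the part of the virtual address space that a 32-bit process can use.

GIB = 1024 ** 3
MIB = 1024 ** 2


def max_mappings(user_space_bytes, mapping_bytes, reserved_bytes=0):
    """Upper bound on same-sized mappings that fit in the usable address space."""
    usable = max(user_space_bytes - reserved_bytes, 0)
    return usable // mapping_bytes


mapping_size = 134217728   # taken from the panic: "mmap, size 134217728"
user_space = 2 * GIB       # assumption: 2 GiB per process, as described above
reserved = 256 * MIB       # hypothetical allowance for heap, stacks and the binary

print(mapping_size // MIB)                                # 128
print(max_mappings(user_space, mapping_size))             # 16
print(max_mappings(user_space, mapping_size, reserved))   # 14
```

So even in the best case only about 16 mappings of this size fit, regardless of how much RAM the Pi has. In practice the limit is reached earlier, because each mapping needs a contiguous free range and the address space becomes fragmented.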

The only ways to solve this are to reduce the local retention, which shrinks the TSDB on disk, or to switch to a 64-bit kernel for the Pi 3/Pi 4.
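
To pick a retention that keeps the TSDB well below that limit, the Prometheus storage documentation gives a rule of thumb: needed disk space is roughly retention time in seconds times ingested samples per second times bytes per sample, with about 1 to 2 bytes per sample. The Python sketch below simply inverts that formula. The size budget and ingestion rate are hypothetical; substitute the values from your own cluster.

```python
# Invert the rule of thumb
#   disk_bytes ~= retention_seconds * samples_per_second * bytes_per_sample
# to get the longest retention that fits a given size budget.

SECONDS_PER_DAY = 86400


def max_retention_days(size_budget_bytes, samples_per_second, bytes_per_sample=2.0):
    """Approximate retention (in days) that stays within size_budget_bytes."""
    if samples_per_second <= 0 or bytes_per_sample <= 0:
        raise ValueError("ingestion rate and bytes per sample must be positive")
    seconds = size_budget_bytes / (samples_per_second * bytes_per_sample)
    return seconds / SECONDS_PER_DAY


budget = 1 * 1024 ** 3   # hypothetical target: keep the TSDB around 1 GiB
rate = 1000              # hypothetical ingestion rate in samples per second

print(round(max_retention_days(budget, rate), 1))   # 6.2
```

The result can then be applied with the Prometheus flags --storage.tsdb.retention.time or --storage.tsdb.retention.size. How those flags are exposed through the Helm chart values depends on the chart and its version, so check the chart's values file rather than relying on this sketch.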