Prometheus alert rule does not change from the Pending state to the Firing state

Hi Team,
I defined two alert rules in Prometheus, one for CPU usage and one for memory usage, and set a for clause on each. When memory usage goes above 80%, the alert stays in Pending and never moves to Firing, even after a long time. I have confirmed that the node's memory usage has stayed above 80% the whole time.

The Prometheus global settings are scrape_interval=15s and evaluation_interval=15s. The configuration of each component follows.

prometheus-server config file:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["alertmanager.monitor.svc.cluster.local:9093"]

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /config/rules/*.yaml

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['127.0.0.1:9090']

alert rules config file:

groups:
- name: "node_exporter"
  rules:
  - alert: "Memory usage is greater than 80%"
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 30s
    labels:
      severity: warning
      now: "{{ $value }}"
    annotations:
      description: "Server{{$labels.instance}} Memory usage is greater than 80%"

  - alert: "CPU usage is greater than 80%"
    expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) > 80
    for: 30s
    labels:
      severity: warning
      now: "{{ $value }}"
    annotations:
      description: "Server{{$labels.instance}} CPU usage is greater than 80%"

alertmanager config file:

global:
route:
  group_by: ['instance']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 1h
  receiver: 'webhook'
receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://prometheus-alert-center.monitor.svc.cluster.local:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxx'
    send_resolved: true

The Prometheus UI shows the alert staying in the Pending state.

What I don't understand is why, once a for clause is configured in the alert rule, the alert never changes from Pending to Firing. Any help would be appreciated.

The "now" field should be an annotation, not a label.

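By putting it in a label, you create a new alert every time the rule is evaluated. Labels define the identity of an alert, and {{ $value }} changes on almost every evaluation, so each evaluation produces an alert with a different label set. That new alert starts its for timer from zero and is replaced before it reaches 30s, so nothing ever gets to Firing. Annotations can change without creating a new alert.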
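
As a minimal sketch, the memory rule with that change applied could look like the following. It assumes everything else in the rule stays as posted; keeping the name "now" for the annotation is only a carry-over from the original label, and the same change would apply to the CPU rule.

```yaml
groups:
- name: "node_exporter"
  rules:
  - alert: "Memory usage is greater than 80%"
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 30s
    labels:
      # Only static values here: the label set is the alert's identity.
      severity: warning
    annotations:
      description: "Server{{$labels.instance}} Memory usage is greater than 80%"
      # The current value may change between evaluations without resetting the for timer.
      now: "{{ $value }}"
```

With the label set stable across evaluations, the same alert should stay Pending for 30s (two evaluations at a 15s interval) and then move to Firing.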

Thank you very much for your reply. After I removed "now" from the labels, the alerts were pushed as I expected. The problem is solved. Great!
