The prometheus alert rule cannot be changed from the Pending state to the Firing state

zeng · November 14, 2021, 1:20pm

Hi Team,
I defined two alerting rules in Prometheus, one for CPU usage and one for memory usage, and set a for clause in each rule. When memory usage goes above 80%, the alert stays in the Pending state and does not move to the Firing state, even after a long time. I have confirmed that the node's memory usage has stayed above 80% the whole time.

In the Prometheus global settings, scrape_interval and evaluation_interval are both set to 15s. The configuration of each component follows.

Prometheus server config file:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["alertmanager.monitor.svc.cluster.local:9093"]

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - /config/rules/*.yaml

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['127.0.0.1:9090']

Alert rules config file:

groups:
- name: "node_exporter"
  rules:
  - alert: "Memory usage is greater than 80%"
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 30s
    labels:
      severity: warning
      now: "{{ $value }}"
    annotations:
      description: "Server{{$labels.instance}} Memory usage is greater than 80%"

  - alert: "CPU usage is greater than 80%"
    expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{ mode="idle"}[1m])) * 100) > 80
    for: 30s
    labels:
      severity: warning
      now: "{{ $value }}"
    annotations:
      description: "Server{{$labels.instance}} CPU usage is greater than 80%"

Alertmanager config file:

global:
route:
  group_by: ['instance']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 1h
  receiver: 'webhook'
receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://prometheus-alert-center.monitor.svc.cluster.local:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxx'
    send_resolved: true

In the Prometheus UI, the alert is shown as Pending.

What I don't understand is why, once a for clause is configured in the alert rule, the alert never changes from the Pending state to the Firing state. Any help would be appreciated.

roidelapluie · November 17, 2021, 7:48am

The now value should be in the annotations, not in the labels.

By putting it in a label, you create a new alert every time the value changes. Labels are what define the uniqueness of an alert, so each new alert starts its for period over again and never reaches Firing. Annotations can change without creating a new alert.
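
For illustration, here is a minimal sketch of how the memory rule from the question might look with the value moved out of the labels. It assumes everything else in the rule stays as posted; the wording of the description and the "current value" suffix are only an example, not something from the original config. The CPU rule would need the same change.

```yaml
groups:
- name: "node_exporter"
  rules:
  - alert: "Memory usage is greater than 80%"
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 30s
    labels:
      # Only constant values here, so the alert keeps the same identity
      # across evaluations and the for period can elapse.
      severity: warning
    annotations:
      # $value can change on every evaluation; annotations do not
      # affect the identity of the alert.
      description: "Server {{ $labels.instance }} memory usage is greater than 80% (current value: {{ $value }})"
```

With a constant label set, the same alert stays active across consecutive evaluations, so it should move from Pending to Firing once the condition has held for the full 30s.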

zeng · November 17, 2021, 10:16am

Thank you very much for your reply. After I removed now from the labels, the alerts were pushed as I expected. The problem is solved. Great!
