The problem: the Prometheus alert rules work as expected, and the alert can stay in the "firing" state the whole time, but Alertmanager does not receive the alert. This happens only occasionally, so some alerts are missed.
The Prometheus alert config:
    groups:
    - name: server_upgrade_succeed_rule_group
      rules:
      - alert: server_upgrade_succeed_rule_alert
        expr: SERVICE_UPGRADE_SUCCESS == 1
        for: 0s
        annotations:
          level: "info"
          category: "app_event"
          rule_id: "server_upgrade_succeed_rule"
          component_id: ""
          summary: "server upgrade successfully"
In more detail, the metric looks like this:
I suspect the "expr" is wrong: the metric is just a single point on the graph, so the PromQL query may not return any data at the moment the rule is evaluated. I hope someone can help me figure this out.
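To make the guess above concrete, here is a variant of the rule I could try. This is an untested sketch, not a confirmed fix. It assumes the problem really is that the single sample is missed at evaluation time, and the 5m window is an arbitrary value chosen for illustration. `max_over_time` is a standard PromQL function that returns the largest sample value inside the given range, so the expression should keep returning a result for as long as the single point falls inside the window.

```yaml
groups:
  - name: server_upgrade_succeed_rule_group
    rules:
      - alert: server_upgrade_succeed_rule_alert
        # Assumption: a 5m window is wide enough to cover the single sample.
        # max_over_time returns the largest value seen in that window, so the
        # comparison still matches after the raw sample itself is gone.
        expr: max_over_time(SERVICE_UPGRADE_SUCCESS[5m]) == 1
        for: 0s
        annotations:
          summary: "server upgrade successfully"
```

One trade-off of this form is that the alert would stay active for roughly the length of the window after the sample appears, so the window size would need tuning against how often upgrades happen.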