Rule kinds
scheduled
Periodic SQL evaluation. The rule’s query runs on an interval; the trigger condition decides
whether to fire.
realtime
In-ingest predicate match. Evaluated as data lands, so it fires in under a second — faster
than any scrape/eval-interval approach.
anomaly
Baseline via MAD (median absolute deviation) plus a 3σ band. Fires when a series deviates from
its learned baseline.
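As a rough illustration of the anomaly rule kind, the sketch below builds a band around the median from a window of past values and flags points outside it. It assumes the 3σ band is derived from the MAD using the standard 1.4826 factor (which makes the MAD comparable to a standard deviation for normally distributed data); the function names and the `history` window are hypothetical, not the product's actual implementation.

```python
from statistics import median

# Scale factor that makes MAD comparable to a standard deviation
# for normally distributed data (assumption: the rule uses this scaling).
MAD_TO_SIGMA = 1.4826


def mad_band(history, k=3.0):
    """Return (low, high) bounds: median +/- k * scaled MAD."""
    center = median(history)
    mad = median([abs(x - center) for x in history])
    sigma = MAD_TO_SIGMA * mad
    # Note: if more than half the values are identical, MAD is 0 and
    # the band collapses to a single point.
    return center - k * sigma, center + k * sigma


def is_anomalous(value, history, k=3.0):
    """True when value falls outside the band learned from history."""
    low, high = mad_band(history, k)
    return value < low or value > high


history = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0]
print(is_anomalous(10.5, history))  # inside the band
print(is_anomalous(25.0, history))  # far outside the band
```

Using the median and MAD rather than the mean and standard deviation keeps a few extreme points in the history from widening the band.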
Escalation policies
An escalation policy is a multi-step plan attached to a rule. The alert_manager role drives it:
- On each tick, it scans open incidents.
- If the current step hasn’t been dispatched, it resolves the step’s targets and on-call, then sends through the matching channel.
- If no acknowledgement arrives before ack_timeout, it advances to the next step and dispatches again.
- This repeats until the incident is acknowledged or resolved, or all steps are exhausted.
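The escalation loop described above can be sketched as a small state machine. This is a minimal model under stated assumptions, not the alert_manager's actual code: the `Step` and `Incident` classes, the `dispatched_at` field, and the `send` callback are hypothetical, while `step`, `ack_timeout`, `acknowledged_at`, and `resolved_at` follow the names used in this document. On-call resolution is left out; `targets` stands in for the already-resolved recipients.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    targets: List[str]
    channel: str
    ack_timeout: float  # seconds to wait for an acknowledgement


@dataclass
class Incident:
    steps: List[Step]
    step: int = 0                            # index of the current step
    dispatched_at: Optional[float] = None    # hypothetical bookkeeping field
    acknowledged_at: Optional[float] = None
    resolved_at: Optional[float] = None


def tick(incident, now, send):
    """Process one open incident on one tick."""
    if incident.acknowledged_at is not None or incident.resolved_at is not None:
        return  # acknowledged or resolved: nothing to escalate
    if incident.step >= len(incident.steps):
        return  # all steps exhausted
    current = incident.steps[incident.step]
    if incident.dispatched_at is None:
        # Current step has not been dispatched yet.
        send(current.channel, current.targets)
        incident.dispatched_at = now
        return
    if now - incident.dispatched_at >= current.ack_timeout:
        # No acknowledgement in time: advance and dispatch the next step.
        incident.step += 1
        incident.dispatched_at = None
        if incident.step < len(incident.steps):
            nxt = incident.steps[incident.step]
            send(nxt.channel, nxt.targets)
            incident.dispatched_at = now


sent = []
incident = Incident(steps=[
    Step(["primary"], "slack", 300),
    Step(["manager"], "email", 600),
])
for now in (0, 100, 300, 400):
    tick(incident, now, lambda channel, targets: sent.append((channel, targets)))
print(sent)
```

In this run the first step goes out at time 0, nothing happens at 100, and the unacknowledged incident escalates to the second step at 300.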
Channels
Notifications go out over Slack, webhook, and email, with PagerDuty-style escalation and template variables. Incidents carry step, acknowledged_at, and resolved_at, mirroring a
PagerDuty incident lifecycle.
Managing alerts via the API
| Endpoint | Purpose |
|---|---|
| POST /api/v1/alerts/rules | Create a rule |
| GET / PATCH /api/v1/alerts/rules/{id} | Read / update a rule |
| POST /api/v1/alerts/escalations | Create an escalation policy |
| POST /api/v1/alerts/channels | Create a notification channel |
| GET /api/v1/alerts/incidents | List incidents |
| POST /api/v1/alerts/incidents/{id}/ack | Acknowledge an incident |
| POST /api/v1/alerts/incidents/{id}/resolve | Resolve an incident |
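As an example of calling these endpoints, the sketch below builds the POST request for acknowledging or resolving an incident using only the Python standard library. The base URL, the incident ID, and the `incident_action` helper are hypothetical, and the sketch assumes an empty request body and no authentication header, neither of which this document specifies.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical; use your deployment's address


def incident_action(incident_id, action, base_url=BASE_URL):
    """Build a POST request for the ack or resolve endpoint."""
    if action not in ("ack", "resolve"):
        raise ValueError("action must be 'ack' or 'resolve'")
    safe_id = urllib.parse.quote(str(incident_id), safe="")
    url = f"{base_url}/api/v1/alerts/incidents/{safe_id}/{action}"
    # Assumption: the endpoint accepts an empty body.
    return urllib.request.Request(url, data=b"", method="POST")


req = incident_action("inc-123", "ack")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL, or to add whatever headers your deployment requires, before any network call is made.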
The alerting logic is complete but still needs production hours. If a realtime rule misses an event or an
escalation behaves unexpectedly, please file an issue and include the rule definition.