Alerting Strategy

Status: 📋 Planned. Alertmanager is not deployed, and no alert rules or notification channels exist today.

Alert rules and runbooks have been designed, but nothing is configured yet in Prometheus or Alertmanager.

Intended alert rules

Condition	Intended severity
`soccer_model_loaded == 0`	P1: model not available
Sustained `soccer_request_duration_seconds` p99 > 500 ms	P2: latency SLO breach
Elevated `soccer_errors_total` rate (5xx)	P2: error spike
Service unavailable (`/healthcheck/` failing)	P1
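
As a rough illustration, the conditions in the table above could be written as a Prometheus rule file along these lines. This is only a sketch, not the planned configuration: the alert names, `for` durations, error-rate threshold, `soccer-healthcheck` job name, and runbook URLs are all placeholders invented for the example. It also assumes that `soccer_request_duration_seconds` is exported as a histogram and that `/healthcheck/` is probed through blackbox_exporter.

```yaml
# Sketch of a Prometheus rule file for the service-level alerts above.
# All thresholds, durations, job names, and URLs are placeholders.
groups:
  - name: soccer-api
    rules:
      - alert: SoccerModelNotLoaded
        expr: soccer_model_loaded == 0
        for: 1m
        labels:
          severity: P1
        annotations:
          summary: Model not available
          runbook_url: https://runbooks.example.com/soccer/model-not-loaded  # hypothetical

      - alert: SoccerLatencyP99High
        # Assumes a histogram, so a *_bucket series exists; 0.5 s = 500 ms.
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(soccer_request_duration_seconds_bucket[5m]))
          ) > 0.5
        for: 10m
        labels:
          severity: P2
        annotations:
          summary: p99 latency above 500 ms
          runbook_url: https://runbooks.example.com/soccer/latency  # hypothetical

      - alert: SoccerErrorRateHigh
        # Placeholder threshold (errors per second). Add a label matcher here
        # if the counter distinguishes 5xx responses from other errors.
        expr: sum(rate(soccer_errors_total[5m])) > 1
        for: 5m
        labels:
          severity: P2
        annotations:
          summary: Elevated error rate
          runbook_url: https://runbooks.example.com/soccer/errors  # hypothetical

      - alert: SoccerHealthcheckFailing
        # Assumes blackbox_exporter probes /healthcheck/ under this job name.
        expr: probe_success{job="soccer-healthcheck"} == 0
        for: 2m
        labels:
          severity: P1
        annotations:
          summary: Healthcheck failing
          runbook_url: https://runbooks.example.com/soccer/unavailable  # hypothetical
```

A file like this can be validated with `promtool check rules <file>` before it is loaded. The `for` clauses are what would express "sustained" in practice, so their values deserve tuning against real traffic.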

Condition	Intended severity
`soccer_celery_queue_length` growing without bound	P2
`soccer_celery_workers_active == 0`	P1: no workers processing
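
The two Celery conditions above could be sketched as a second rule group. Again, this is an assumption-laden example rather than the planned configuration: "growing without bound" is approximated here as a positive slope sustained over a placeholder window, and the alert names, durations, and runbook URLs are hypothetical.

```yaml
# Sketch of a Prometheus rule group for the Celery alerts above.
# Windows, durations, and URLs are placeholders.
groups:
  - name: soccer-celery
    rules:
      - alert: SoccerCeleryQueueGrowing
        # deriv() estimates the per-second slope of a gauge; a positive slope
        # held for 30m is used as a stand-in for "growing without bound".
        expr: deriv(soccer_celery_queue_length[15m]) > 0
        for: 30m
        labels:
          severity: P2
        annotations:
          summary: Celery queue length keeps growing
          runbook_url: https://runbooks.example.com/soccer/celery-queue  # hypothetical

      - alert: SoccerCeleryNoWorkers
        expr: soccer_celery_workers_active == 0
        for: 2m
        labels:
          severity: P1
        annotations:
          summary: No Celery workers processing
          runbook_url: https://runbooks.example.com/soccer/celery-workers  # hypothetical
```

A slope-based expression avoids hard-coding a queue-length limit, but an absolute threshold may be simpler if typical queue depth is known.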

Every alert will link to its corresponding runbook in Runbooks.