Service & Infrastructure Metrics (Prometheus)

Status: ✅ Operational — GET /metrics endpoint live, 8 metrics exported

Metrics are collected from the FastAPI inference service via src/app/metrics.py and a _PrometheusMiddleware applied to all requests.

Available at: GET /metrics (Prometheus exposition format)


Exported metrics

API metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_requests_total | Counter | Total HTTP requests by endpoint and status |
| soccer_request_duration_seconds | Histogram | Request latency by endpoint (p50/p95/p99) |
| soccer_errors_total | Counter | Total 4xx/5xx errors |
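As a small illustration of what these counters look like in the exposition format, here is a hypothetical error counter in a throwaway registry (not the service's actual registry):

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry so the example doesn't touch the process-wide default.
registry = CollectorRegistry()
ERRORS = Counter(
    "soccer_errors_total", "Total 4xx/5xx errors", registry=registry
)

ERRORS.inc()  # e.g. incremented by an exception handler on a 5xx response

# generate_latest() renders what GET /metrics returns for this registry:
# a HELP/TYPE header followed by one sample line per series.
print(generate_latest(registry).decode())
```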

Prediction metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_predictions_total | Counter | Total predictions served |
| soccer_model_loaded | Gauge | 1 if model is loaded, 0 otherwise |
| soccer_model_version | Gauge (label) | Currently loaded model version |
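The two model gauges follow the common "info via label" pattern: the version string lives in a label and the sample value is a constant 1. A plausible sketch, where the on_model_load / on_model_unload hooks are hypothetical names not taken from this page:

```python
from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()

MODEL_LOADED = Gauge(
    "soccer_model_loaded", "1 if model is loaded, 0 otherwise",
    registry=registry,
)
# The version itself is carried in the "version" label.
MODEL_VERSION = Gauge(
    "soccer_model_version",
    "Currently loaded model version",
    ["version"],
    registry=registry,
)


def on_model_load(version: str) -> None:
    # Hypothetical hook called after the model artifact is deserialized.
    MODEL_LOADED.set(1)
    MODEL_VERSION.labels(version=version).set(1)


def on_model_unload() -> None:
    MODEL_LOADED.set(0)
```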

Celery metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_celery_queue_length | Gauge | Per-queue message count |
| soccer_celery_workers_active | Gauge | Active Celery worker count |

Celery runtime status is also available via REST:

  • GET /monitoring/celery/queues
  • GET /monitoring/celery/workers


Not yet implemented

  • RabbitMQ queue metrics via dedicated exporter
  • Kubernetes CPU / memory / pod restarts
  • PostgreSQL query latency via pg_exporter
  • Log aggregation (stdout only today)

Dashboards

Grafana dashboards for these metrics are planned; see Dashboards. For the full coverage matrix, see Monitoring Status.