Service & Infrastructure Metrics (Prometheus)

Status: ✅ Operational — GET /metrics endpoint live, 8 metrics exported

Metrics are collected from the FastAPI inference service via src/app/metrics.py and a _PrometheusMiddleware applied to all requests.

Available at: GET /metrics (Prometheus exposition format)


Exported metrics

API metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_requests_total | Counter | Total HTTP requests by endpoint and status |
| soccer_request_duration_seconds | Histogram | Request latency by endpoint (p50/p95/p99) |
| soccer_errors_total | Counter | Total 4xx/5xx errors |
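As a small illustration of what these counters look like in the exposition format, here is a hypothetical error counter in a throwaway registry (not the service's actual registry):

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry so the example doesn't touch the process-wide default.
registry = CollectorRegistry()
ERRORS = Counter(
    "soccer_errors_total", "Total 4xx/5xx errors", registry=registry
)

ERRORS.inc()  # e.g. incremented by an exception handler on a 5xx response

# generate_latest() renders what GET /metrics returns for this registry:
# a HELP/TYPE header followed by one sample line per series.
print(generate_latest(registry).decode())
```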

Prediction metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_predictions_total | Counter | Total predictions served |
| soccer_model_loaded | Gauge | 1 if model is loaded, 0 otherwise |
| soccer_model_version | Gauge (label) | Currently loaded model version |
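The two model gauges follow the common "info via label" pattern: the version string lives in a label and the sample value is a constant 1. A plausible sketch, where the on_model_load / on_model_unload hooks are hypothetical names not taken from this page:

```python
from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()

MODEL_LOADED = Gauge(
    "soccer_model_loaded", "1 if model is loaded, 0 otherwise",
    registry=registry,
)
# The version itself is carried in the "version" label.
MODEL_VERSION = Gauge(
    "soccer_model_version",
    "Currently loaded model version",
    ["version"],
    registry=registry,
)


def on_model_load(version: str) -> None:
    # Hypothetical hook called after the model artifact is deserialized.
    MODEL_LOADED.set(1)
    MODEL_VERSION.labels(version=version).set(1)


def on_model_unload() -> None:
    MODEL_LOADED.set(0)
```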

Celery metrics (✅ live)

| Metric | Type | Description |
|---|---|---|
| soccer_celery_queue_length | Gauge | Per-queue message count |
| soccer_celery_workers_active | Gauge | Active Celery worker count |

Celery runtime status is also available via REST:

  • GET /monitoring/celery/queues
  • GET /monitoring/celery/workers


Not yet implemented

  • RabbitMQ queue metrics via dedicated exporter
  • Kubernetes CPU / memory / pod restarts
  • PostgreSQL query latency via pg_exporter
  • Log aggregation (stdout only today)

Dashboards

Grafana dashboards for these metrics are planned; see Dashboards. For the full coverage matrix, see Monitoring Status.