Dashboards & Visualisation (Grafana)¶
Purpose of dashboards¶
Dashboards provide situational awareness and reduce mean time to diagnosis (MTTD).
Core dashboards¶
Service Overview¶
- request rate,
- latency percentiles,
- error rate,
- active model version.
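As a rough illustration, the panels above could be expressed as a Grafana dashboard JSON model generated from a short Python script. This is a minimal sketch, not the project's actual dashboard: the metric names (`http_requests_total`, `http_request_duration_seconds_bucket`, `model_info`) and the PromQL expressions are assumptions and should be replaced with whatever the service really exports.

```python
import json

# (panel title, panel type, PromQL expression).
# All metric names below are hypothetical placeholders.
PANELS = [
    ("Request rate", "timeseries",
     'sum(rate(http_requests_total[5m]))'),
    ("Latency p95", "timeseries",
     'histogram_quantile(0.95, sum by (le) '
     '(rate(http_request_duration_seconds_bucket[5m])))'),
    ("Error rate", "timeseries",
     'sum(rate(http_requests_total{status=~"5.."}[5m])) '
     '/ sum(rate(http_requests_total[5m]))'),
    ("Active model version", "stat", 'model_info'),
]


def build_dashboard(title, panels):
    """Return a dict in the general shape of a Grafana dashboard JSON model."""
    return {
        "title": title,
        "panels": [
            {
                "id": i + 1,
                "title": name,
                "type": panel_type,
                # Grafana's grid is 24 columns wide: two panels per row.
                "gridPos": {"h": 8, "w": 12,
                            "x": (i % 2) * 12, "y": (i // 2) * 8},
                "targets": [{"expr": expr, "refId": "A"}],
            }
            for i, (name, panel_type, expr) in enumerate(panels)
        ],
    }


dashboard = build_dashboard("Service Overview", PANELS)
print(json.dumps(dashboard, indent=2))
```

Keeping the panel list as data makes it easy to review which signals a dashboard covers and to keep it under version control alongside the service.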
Async Processing¶
- queue depth,
- worker throughput,
- retry behavior,
- backlog trends.
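To make "backlog trends" concrete, one possible approach is to fit a straight line to recent queue depth samples and look at its slope. The sketch below assumes samples arrive as `(timestamp_seconds, queue_depth)` pairs; the function name `backlog_slope` and the sample data are illustrative only.

```python
def backlog_slope(samples):
    """Least-squares slope of queue depth over time, in items per second.

    samples: list of (timestamp_seconds, queue_depth) pairs.
    A positive slope means the backlog is growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("timestamps must not all be identical")
    return cov / var


# Illustrative data: depth grows by 30 items every 60 seconds.
samples = [(0, 100), (60, 130), (120, 160), (180, 190)]
slope = backlog_slope(samples)
print(f"backlog is changing by {slope:.2f} items/s")
```

A sustained positive slope suggests that workers are not keeping up with incoming work, even if the current queue depth still looks acceptable.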
Infrastructure¶
- CPU / memory usage,
- pod health,
- scaling events.
Dashboard design principles¶
- focus on trends, not single points,
- align panels with SLOs,
- avoid excessive cardinality,
- annotate deployments and model changes.
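Annotating deployments and model changes can be automated from a deployment pipeline. The sketch below only builds a payload in the shape accepted by Grafana's annotations HTTP API (`time` in epoch milliseconds, `tags`, `text`); it does not send anything. The function name, tag scheme, and example version string are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def deployment_annotation(model_version, deployed_at, kind="deployment"):
    """Build an annotation payload for a deployment or model change.

    The tag scheme ("deployment", "model:<version>") is a hypothetical
    convention, not something Grafana requires.
    """
    if deployed_at.tzinfo is None:
        raise ValueError("deployed_at must be timezone-aware")
    return {
        "time": int(deployed_at.timestamp() * 1000),  # epoch milliseconds
        "tags": [kind, f"model:{model_version}"],
        "text": f"{kind}: model {model_version}",
    }


payload = deployment_annotation(
    "v1.4.2", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
print(json.dumps(payload))
```

In practice a payload like this would be posted to the Grafana instance with an authenticated HTTP client. Putting the model version in annotation tags rather than in metric labels also helps keep metric cardinality low.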
Model-aware dashboards¶
Dashboards explicitly display:
- active model version,
- deployment timestamp,
- recent promotions or rollbacks.
This enables correlation between model changes and system behavior.
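One simple way to check such a correlation outside the dashboard is to compare a metric's average just before and just after a model change. This is a minimal sketch under assumed inputs: `series` is a list of `(timestamp_seconds, value)` pairs, and the function name and sample error rates are hypothetical.

```python
def mean_before_after(series, change_ts, window):
    """Mean metric value in the `window` seconds before and after a change.

    series: list of (timestamp_seconds, value) pairs.
    Returns (mean_before, mean_after).
    """
    before = [v for t, v in series if change_ts - window <= t < change_ts]
    after = [v for t, v in series if change_ts <= t < change_ts + window]
    if not before or not after:
        raise ValueError("not enough samples around the change")
    return sum(before) / len(before), sum(after) / len(after)


# Illustrative error-rate samples; a model promotion happens at t=120.
error_rate = [(0, 0.01), (60, 0.01), (120, 0.05), (180, 0.05)]
before, after = mean_before_after(error_rate, change_ts=120, window=120)
print(f"error rate before: {before:.3f}, after: {after:.3f}")
```

A visible jump between the two averages is only a hint that the model change is involved; traffic shifts or other deployments in the same window can produce the same pattern.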