Dashboards & Visualisation (Grafana)

Purpose of dashboards

Dashboards provide situational awareness and reduce mean time to diagnosis (MTTD).


Core dashboards

Service Overview

  • request rate,
  • latency percentiles,
  • error rate,
  • active model version.
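
As a rough illustration of what the first three panels summarize, the sketch below computes request rate, latency percentiles, and error rate from a window of request records. The `Request` type, the `summarize` helper, the nearest-rank percentile method, and the choice to count only 5xx statuses as errors are all assumptions made for this example; a real deployment would normally derive these values from its metrics backend rather than from raw records.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; field names are illustrative only."""
    timestamp: float   # seconds since epoch
    latency_ms: float  # end-to-end latency in milliseconds
    status: int        # HTTP status code


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(requests, window_s):
    """Summarize one time window of requests (assumed non-empty)."""
    latencies = [r.latency_ms for r in requests]
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "request_rate": len(requests) / window_s,  # requests per second
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "error_rate": errors / len(requests),      # fraction in [0, 1]
    }
```

Percentiles are shown rather than a mean because a small number of slow requests can dominate user experience while barely moving the average.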

Async Processing

  • queue depth,
  • worker throughput,
  • retry behavior,
  • backlog trends.
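
One possible way to quantify a backlog trend is the least-squares slope of queue depth over time: a sustained positive slope means work is arriving faster than workers drain it. The sketch below is a minimal illustration under that assumption; `backlog_slope` and the `(timestamp, depth)` sample format are hypothetical names chosen for this example.

```python
def backlog_slope(samples):
    """Least-squares slope of queue depth over time, in items per second.

    `samples` is a list of (timestamp_seconds, queue_depth) pairs.
    This is a hypothetical helper, not part of any monitoring library.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_t = sum(t for t, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    numerator = sum((t - mean_t) * (d - mean_d) for t, d in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("all samples share the same timestamp")
    return numerator / denominator


# Queue depth sampled once per minute, growing by 30 items each time.
growth = backlog_slope([(0, 100), (60, 130), (120, 160)])  # 0.5 items/second
```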

Infrastructure

  • CPU / memory usage,
  • pod health,
  • scaling events.

Dashboard design principles

  • focus on trends, not single points,
  • align panels with SLOs,
  • avoid excessive cardinality,
  • annotate deployments and model changes.
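
For the last principle, deployments and model changes can be recorded as Grafana annotations. The sketch below only builds a request body of the shape accepted by Grafana's annotations HTTP endpoint (`POST /api/annotations`, with `time` in epoch milliseconds, plus `tags` and `text`). The helper name, the tag scheme, and the message wording are assumptions for illustration, and sending the request (authentication, base URL) is deliberately left out.

```python
from datetime import datetime, timezone


def deployment_annotation(service, model_version, deployed_at, kind="deployment"):
    """Build an annotation body for a deployment or model change.

    Hypothetical helper: the tag names and text format are illustrative,
    not a convention defined by Grafana or by this document.
    """
    if deployed_at.tzinfo is None:
        raise ValueError("deployed_at must be timezone-aware")
    return {
        # Grafana expects annotation times as epoch milliseconds.
        "time": int(deployed_at.timestamp() * 1000),
        "tags": [kind, f"service:{service}", f"model:{model_version}"],
        "text": f"{kind}: {service} now serving model {model_version}",
    }


payload = deployment_annotation(
    service="inference-api",   # assumed service name
    model_version="v42",       # assumed version label
    deployed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Tagging by service and model version lets a dashboard annotation query filter on tags, so each panel can show only the events relevant to it.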

Model-aware dashboards

Dashboards explicitly display:

  • active model version,
  • deployment timestamp,
  • recent promotions or rollbacks.

This enables correlation between model changes and system behavior.
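
A simple form of this correlation is a before/after comparison of a metric around the deployment timestamp. The sketch below assumes a series of `(timestamp, value)` samples, for example error rate, and an equal-length window on each side of the change; the function name and window choice are hypothetical, and a difference found this way is a prompt for investigation rather than proof of causation.

```python
def compare_around_change(samples, change_ts, window_s):
    """Compare the mean of a metric before and after a model change.

    `samples` is a list of (timestamp_seconds, value) pairs.
    Returns (mean_before, mean_after, delta). Hypothetical helper.
    """
    before = [v for t, v in samples if change_ts - window_s <= t < change_ts]
    after = [v for t, v in samples if change_ts <= t < change_ts + window_s]
    if not before or not after:
        raise ValueError("need samples on both sides of the change")
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return mean_before, mean_after, mean_after - mean_before


# Error rate sampled once per minute; a model promotion happens at t=120.
error_rate = [(0, 0.01), (60, 0.01), (120, 0.05), (180, 0.05)]
mean_before, mean_after, delta = compare_around_change(
    error_rate, change_ts=120, window_s=120
)
```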