Model Registry
Status: 🚧 Partial (Registration automated via DVC; Staging→Production gate is manual)
Role of the registry
The MLflow Model Registry serves as the single source of truth for deployable models.
Current Implementation
✅ MLflow Tracking
All training runs are logged to MLflow with:
- Hyperparameters
- Metrics (accuracy, precision, recall, F1)
- Artifacts (models, plots, confusion matrices)
- Code version (git commit)
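The logging described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: `log_training_run` and the `git_commit` tag name are hypothetical, and `tracker` is assumed to be the `mlflow` module (or any object exposing the same `start_run`, `log_params`, `log_metrics`, `log_artifact`, and `set_tag` functions).

```python
from typing import Any, Mapping, Sequence


def log_training_run(
    tracker: Any,
    params: Mapping[str, Any],
    metrics: Mapping[str, float],
    artifact_paths: Sequence[str],
    git_commit: str,
) -> None:
    """Log one training run: hyperparameters, metrics, artifacts, code version.

    `tracker` is assumed to be the `mlflow` module; it is passed in so the
    function can be exercised without a tracking server.
    """
    with tracker.start_run():
        tracker.log_params(dict(params))    # hyperparameters
        tracker.log_metrics(dict(metrics))  # accuracy, precision, recall, F1, ...
        for path in artifact_paths:         # model files, plots, confusion matrices
            tracker.log_artifact(path)
        # Hypothetical tag name for the code version.
        tracker.set_tag("git_commit", git_commit)
```

With MLflow installed, a call might look like `log_training_run(mlflow, {"lr": 0.1}, {"f1": 0.9}, ["confusion_matrix.png"], commit_sha)`, where the values shown are placeholders.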
Evidence:
✅ Automated Registration via DVC
Model registration is the final stage of the DVC pipeline (dvc.yaml):
    register_model:
      cmd: python -m src.pipelines cli-register-model data/models/run_id.json
      deps:
        - data/models/run_id.json
        - src/pipelines/register_model.py
src/pipelines/register_model.py performs three steps:
1. Creates the registered model if it doesn’t exist (idempotent).
2. Creates a new model version from the training run.
3. Transitions the version to settings.mlflow.model_stage (default: Staging).
Re-running dvc repro with the same run_id is safe (idempotent by design).
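The three registration steps can be sketched as below. This is an illustration under stated assumptions rather than the contents of src/pipelines/register_model.py: the function signature is hypothetical, `client` is assumed to behave like `mlflow.tracking.MlflowClient`, and reusing an existing version for the same run_id is only one possible way to make repeated calls harmless.

```python
from typing import Any


def register_model(
    client: Any,
    name: str,
    run_id: str,
    source: str,
    stage: str = "Staging",
) -> str:
    """Register the model from `run_id` and move it to `stage`.

    `client` is assumed to expose the MlflowClient registry methods used
    below. `source` is the artifact location of the logged model.
    Returns the model version as a string.
    """
    # 1. Create the registered model only if it does not exist yet.
    found = client.search_registered_models(filter_string=f"name='{name}'")
    if not any(m.name == name for m in found):
        client.create_registered_model(name)

    # 2. Create a version from the training run. Assumption: if a version for
    #    this run_id already exists, reuse it instead of creating a duplicate.
    versions = client.search_model_versions(f"name='{name}'")
    match = next((v for v in versions if v.run_id == run_id), None)
    if match is None:
        match = client.create_model_version(name=name, source=source, run_id=run_id)

    # 3. Transition the version to the configured stage.
    client.transition_model_version_stage(
        name=name, version=match.version, stage=stage
    )
    return str(match.version)
```

Passing the client in as an argument keeps the function easy to test with a stub; in a real pipeline it would typically be constructed once from the configured tracking URI.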
🚧 Manual: Staging → Production Promotion
Promoting a model from Staging to Production requires manual approval.
This is a deliberate quality gate, not a missing feature.
Future workflow (Phase 5):
1. Validation metrics compared against thresholds.
2. If better than current production model → auto-promote to Staging.
3. After canary test period → Production.
4. Old model → Archived.
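Steps 1 and 2 of the planned workflow amount to a comparison gate. Since Phase 5 is not implemented, the following is only a hypothetical sketch: the function name, the choice of F1 as the primary metric, and the shape of the threshold mapping are all assumptions.

```python
from typing import Mapping


def passes_gate(
    candidate: Mapping[str, float],
    production: Mapping[str, float],
    thresholds: Mapping[str, float],
    primary_metric: str = "f1",
) -> bool:
    """Return True if the candidate clears every threshold and beats production.

    A metric missing from `candidate` fails its threshold; a metric missing
    from `production` (e.g. no production model yet) is treated as beaten.
    """
    lowest = float("-inf")
    # Step 1: every thresholded metric must reach its minimum value.
    meets_minimums = all(
        candidate.get(metric, lowest) >= minimum
        for metric, minimum in thresholds.items()
    )
    # Step 2: the candidate must strictly improve on the production model.
    beats_production = candidate[primary_metric] > production.get(primary_metric, lowest)
    return meets_minimums and beats_production
```

For example, a candidate with F1 0.91 and recall 0.70 would be rejected against a recall threshold of 0.80 even though its F1 exceeds a production F1 of 0.89.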
Model Lifecycle
| Stage | Meaning |
|---|---|
| None | Freshly registered from training run |
| Staging | Registered by register_model DVC stage; ready for testing |
| Production | Currently served by API workers |
| Archived | Deprecated; kept for reproducibility |
Deployment Coupling
PredictionService (in src/app/services/predict.py) loads the model from the registry by stage.
The stage is read from settings.mlflow.model_stage, so moving a different version into that stage in the registry switches the served model without redeployment.
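Stage-based loading of this kind is commonly done through an MLflow `models:/<name>/<stage>` URI. The sketch below shows that general pattern; it is not the code from PredictionService, and both helper names and the model name `example-model` are hypothetical.

```python
def model_uri(name: str, stage: str) -> str:
    """Build an MLflow registry URI of the form models:/<name>/<stage>."""
    return f"models:/{name}/{stage}"


def load_model_for_stage(name: str, stage: str):
    """Load whichever version currently holds `stage` for the registered model."""
    # Imported lazily so model_uri() stays usable without MLflow installed.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(model_uri(name, stage))


# Hypothetical model name; the stage would come from settings.mlflow.model_stage.
print(model_uri("example-model", "Staging"))
```

Because the URI names a stage rather than a version number, the same call resolves to a different model version after a stage transition in the registry.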
Rollback
To roll back, transition a previous model version to Production in the MLflow UI or via the API.
No retraining is required.
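Through the API, a rollback is a single stage transition. The sketch below assumes `client` behaves like `mlflow.tracking.MlflowClient`; the function name is hypothetical, and archiving the version that currently holds Production is a design choice made here, not something the source prescribes.

```python
from typing import Any


def rollback_to_version(client: Any, name: str, version: str) -> None:
    """Move an earlier model version back to Production.

    `client` is assumed to expose MlflowClient.transition_model_version_stage.
    """
    client.transition_model_version_stage(
        name=name,
        version=version,
        stage="Production",
        # Assumption: archive whichever version currently holds Production so
        # that only one version is in the stage after the rollback.
        archive_existing_versions=True,
    )
```

A call such as `rollback_to_version(MlflowClient(), "example-model", "3")` would restore version 3, with the model name and version number here being placeholders.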