MLflow Registry Audit Report: SoccerPredictAI

Date: 2026-04-28
Auditor: GitHub Copilot (Claude Opus 4.7), /skill-ml-system-audit full (audit 05/12)
Scope: MLflow experiments, runs, lineage, Model Registry, aliases
Baseline: docs/validation/20260424/05_mlflow_registry_audit.md

Delta vs baseline

src/utils/mlflow_meta.py, src/models/{classification,final_train}.py, src/pipelines/register_model.py, and the MLflow-related params are unchanged since 2026-04-26, so the baseline findings remain in force.

Confirmed configuration

Active experiment: matches_clf_smoke (per params.yaml: classification.experiment_name).
Registered model: soccer_clf.
Production alias: champion.
Serving load: models:/soccer_clf@champion via PredictionService in src/app/tasks/predict.py (load on worker_process_init).
Lineage tags: data.version (ETag), data.source_last_modified, data.ingested_at, data.train/test_rows, data.train_start/end, data.test_start, pipeline.git_sha, pipeline.params_hash, pipeline.dvc_exp_name, pipeline.run_kind, pipeline.stage, pipeline.scope, pipeline.variant, features.profile, model.family. Dataset logged via mlflow.log_input().
Artifacts: sklearn model, confusion matrix, calibration curves, feature importances, folds, holdout predictions, segment metrics, ECE (raw + calibrated).
Registration: idempotent — re-registering same run_id is a no-op.
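
The idempotent registration noted above could look roughly like the sketch below. It is not the code in src/pipelines/register_model.py, which this audit did not reproduce. The helper name register_once is hypothetical, the client is assumed to be an mlflow.tracking.MlflowClient (or any object exposing the same two methods), and the registered model is assumed to exist already.

```python
def register_once(client, model_name, run_id, source):
    """Return (model_version, created) for run_id without creating duplicates.

    Hypothetical helper. `client` is assumed to behave like MlflowClient:
    search_model_versions(filter_string) and
    create_model_version(name, source, run_id) are the only calls used.
    """
    # Look for a version already registered from this run.
    # Note: with a real client this iterates the first result page only.
    for mv in client.search_model_versions(f"name='{model_name}'"):
        if mv.run_id == run_id:
            return mv, False  # no-op: this run is already registered

    # No existing version for this run, so register a new one.
    mv = client.create_model_version(name=model_name, source=source, run_id=run_id)
    return mv, True
```

Calling it twice with the same run_id should return the same version both times, with created set to False on the second call.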

Risk register (re-confirmed)

ID	Severity	Description	Status
ML-01	P1	All runs land in `matches_clf_smoke` — no automatic switch to a production experiment	Open
ML-02	P1	No champion/challenger comparison — newly registered model becomes `champion` unconditionally	Open
ML-03	P1	No automatic rollback on post-promotion degradation	Open
ML-04	P2	`ablation_study` runs share experiment with train_eval — no explicit separation of exploratory vs training runs	Open
ML-05	P2	`pipeline.dvc_exp_name` requires `DVC_EXP_NAME` env — missing tag when run outside DVC	Open
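
One possible mitigation for ML-05 is to always emit the tag, falling back to an explicit sentinel when DVC_EXP_NAME is absent, so that runs started outside DVC stay searchable. The sketch below is an assumption about how that could be done, not the current behaviour of src/utils/mlflow_meta.py. The function name and the "unset" sentinel are hypothetical.

```python
import os

UNSET = "unset"  # hypothetical sentinel for runs started outside DVC


def dvc_exp_name_tag(environ=None):
    """Build the pipeline.dvc_exp_name tag, never omitting it.

    `environ` defaults to os.environ; passing a dict makes the helper testable.
    """
    env = os.environ if environ is None else environ
    value = env.get("DVC_EXP_NAME", "").strip()
    # Empty or whitespace-only values are treated the same as a missing variable.
    return {"pipeline.dvc_exp_name": value or UNSET}
```

The returned dict can be passed to mlflow.set_tags() together with the other lineage tags.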

Summary

Aspect	Status
Naming conventions (experiments, runs)	✅
Nested parent/child runs	✅
Data + git + params lineage tags	✅
Registration idempotency	✅
Serving consumes by alias	✅
Champion/challenger gate	❌ (ML-02)
Rollback automation	❌ (ML-03)
Production vs smoke experiment separation in active config	❌ (ML-01)

Recommendation: ML-01 and ML-02 together cause R6 (no metric gate before champion promotion). Fix by adding a gate in register_model that compares the new run's metrics against the current champion and aborts on regression. To verify the current champion's metrics live, query the tracking server, for example by resolving the alias with MlflowClient.get_model_version_by_alias and reading that run's metrics with MlflowClient.get_run (out of scope for this read-only audit).
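
A minimal sketch of such a gate follows, under stated assumptions. The function names, the tolerance parameter, and the metric name passed by the caller are hypothetical. The audit does not say which metric should drive promotion, so the metric and its direction are left as arguments. The client is assumed to behave like mlflow.tracking.MlflowClient, and a champion alias is assumed to exist already.

```python
def is_regression(challenger, champion, higher_is_better=False, tolerance=0.0):
    """True when the challenger is worse than the champion by more than tolerance."""
    if higher_is_better:
        return challenger < champion - tolerance
    return challenger > champion + tolerance


def promote_if_not_worse(client, model_name, challenger_version, metric,
                         alias="champion", higher_is_better=False, tolerance=0.0):
    """Move `alias` to challenger_version unless the metric regressed.

    Hypothetical helper. Uses only MlflowClient-style calls:
    get_model_version_by_alias, get_model_version, get_run,
    set_registered_model_alias.
    """
    champion_mv = client.get_model_version_by_alias(model_name, alias)
    challenger_mv = client.get_model_version(model_name, challenger_version)

    # Metrics are read from the runs that produced each model version.
    champion_value = client.get_run(champion_mv.run_id).data.metrics[metric]
    challenger_value = client.get_run(challenger_mv.run_id).data.metrics[metric]

    if is_regression(challenger_value, champion_value, higher_is_better, tolerance):
        return False  # keep the current champion; the caller can abort the stage

    client.set_registered_model_alias(model_name, alias, challenger_version)
    return True
```

The first registration needs separate handling, because resolving an alias that does not exist yet is expected to raise. A gate like this addresses ML-02 only; ML-03 (rollback after promotion) would still need monitoring on live data.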

See baseline §1–§5 for code-level detail.