MLflow Registry Audit Report — SoccerPredictAI

Date: 2026-04-28
Auditor: GitHub Copilot (Claude Opus 4.7), /skill-ml-system-audit full (audit 05/12)
Scope: MLflow experiments, runs, lineage, Model Registry, aliases
Baseline: docs/validation/20260424/05_mlflow_registry_audit.md


Delta vs baseline

src/utils/mlflow_meta.py, src/models/{classification,final_train}.py, src/pipelines/register_model.py, and MLflow-related params unchanged since 2026-04-26. Baseline findings remain in force.


Confirmed configuration

  • Active experiment: matches_clf_smoke (per params.yaml: classification.experiment_name).
  • Registered model: soccer_clf.
  • Production alias: champion.
  • Serving load: models:/soccer_clf@champion via PredictionService in src/app/tasks/predict.py (load on worker_process_init).
  • Lineage tags: data.version (ETag), data.source_last_modified, data.ingested_at, data.train/test_rows, data.train_start/end, data.test_start, pipeline.git_sha, pipeline.params_hash, pipeline.dvc_exp_name, pipeline.run_kind, pipeline.stage, pipeline.scope, pipeline.variant, features.profile, model.family. Dataset logged via mlflow.log_input().
  • Artifacts: sklearn model, confusion matrix, calibration curves, feature importances, folds, holdout predictions, segment metrics, ECE (raw + calibrated).
  • Registration: idempotent — re-registering same run_id is a no-op.
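The idempotency property in the last bullet can be illustrated with a minimal sketch. This is not the code in src/pipelines/register_model.py; the function name register_once and its parameters are hypothetical. It assumes a client exposing search_model_versions and create_model_version in the shape of mlflow.MlflowClient, and that the registered model already exists.

```python
from typing import Any


def register_once(client: Any, model_name: str, run_id: str, source: str) -> str:
    """Return the version registered for run_id, creating one only if absent.

    `client` is assumed to behave like mlflow.MlflowClient.
    `source` is the artifact location of the logged model (caller-supplied).
    """
    # Look up every version already registered under this model name.
    existing = client.search_model_versions(f"name='{model_name}'")
    for version in existing:
        if version.run_id == run_id:
            # Same run already registered: do nothing and reuse its version.
            return str(version.version)

    # No version for this run yet: create exactly one.
    created = client.create_model_version(
        name=model_name, source=source, run_id=run_id
    )
    return str(created.version)
```

Calling register_once twice with the same run_id returns the same version number and leaves the registry with a single version for that run.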

Risk register (re-confirmed)

ID    | Severity | Description | Status
ML-01 | P1 | All runs land in matches_clf_smoke; no automatic switch to a production experiment | Open
ML-02 | P1 | No champion/challenger comparison; newly registered model becomes champion unconditionally | Open
ML-03 | P1 | No automatic rollback on post-promotion degradation | Open
ML-04 | P2 | ablation_study runs share experiment with train_eval; no explicit separation of exploratory vs training runs | Open
ML-05 | P2 | pipeline.dvc_exp_name requires DVC_EXP_NAME env; missing tag when run outside DVC | Open
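One possible mitigation for ML-05 is to always emit the tag and fall back to an explicit sentinel when DVC_EXP_NAME is absent. The sketch below is only an illustration: the helper name dvc_exp_tags and the sentinel value "none" are assumptions, not part of the audited code.

```python
import os
from typing import Dict, Mapping, Optional

DVC_EXP_TAG = "pipeline.dvc_exp_name"


def dvc_exp_tags(
    environ: Optional[Mapping[str, str]] = None, fallback: str = "none"
) -> Dict[str, str]:
    """Build the DVC experiment tag, using `fallback` outside a DVC run."""
    env = os.environ if environ is None else environ
    # Treat an unset or blank variable the same way: no DVC experiment.
    name = env.get("DVC_EXP_NAME", "").strip()
    return {DVC_EXP_TAG: name or fallback}
```

Inside an active run, the result could be passed to mlflow.set_tags(dvc_exp_tags()), so the tag is present on every run and a search for the sentinel value finds the runs started outside DVC.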

Summary

Aspect Status
Naming conventions (experiments, runs)
Nested parent/child runs
Data + git + params lineage tags
Registration idempotency
Serving consumes by alias
Champion/challenger gate ❌ (ML-02)
Rollback automation ❌ (ML-03)
Production vs smoke experiment separation in active config ❌ (ML-01)

Recommendation: ML-01 and ML-02 together cause R6 (no metric gate before champion promotion). Fix by adding a gate in register_model that compares the new run's metrics against the current champion's and aborts on regression. Verifying the current champion's metrics live requires querying the tracking server, for example by resolving the alias with MlflowClient.get_model_version_by_alias("soccer_clf", "champion") and reading the metrics of the run it points to; this is out of scope for a read-only audit.
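A minimal sketch of such a gate follows, under stated assumptions: the function names should_promote and gated_promote, the tolerance parameter, and the choice of metric are all hypothetical, and client is assumed to behave like mlflow.MlflowClient (get_model_version_by_alias, get_run, set_registered_model_alias). The sketch assumes a champion alias already exists; the lookup error raised when it does not is left to the caller, so a tracking-server outage is never mistaken for "no champion".

```python
from typing import Any, Optional


def should_promote(
    candidate: float,
    champion: Optional[float],
    higher_is_better: bool = True,
    tolerance: float = 0.0,
) -> bool:
    """Pure decision rule: promote unless the candidate regresses past tolerance."""
    if champion is None:
        # Assumption: a champion without this metric cannot block promotion.
        return True
    if higher_is_better:
        return candidate >= champion - tolerance
    return candidate <= champion + tolerance


def gated_promote(
    client: Any,
    model_name: str,
    alias: str,
    candidate_version: str,
    candidate_run_id: str,
    metric: str,
    higher_is_better: bool = True,
    tolerance: float = 0.0,
) -> None:
    """Move `alias` to the candidate only if it does not regress on `metric`."""
    # Resolve the current champion and read the metric logged on its run.
    current = client.get_model_version_by_alias(model_name, alias)
    champion_value = client.get_run(current.run_id).data.metrics.get(metric)
    # The candidate must have the metric; a KeyError here is intentional.
    candidate_value = client.get_run(candidate_run_id).data.metrics[metric]

    if not should_promote(candidate_value, champion_value, higher_is_better, tolerance):
        raise RuntimeError(
            f"{metric} regressed: candidate={candidate_value}, "
            f"champion={champion_value}; alias '{alias}' left unchanged"
        )
    client.set_registered_model_alias(model_name, alias, candidate_version)
```

For a lower-is-better metric such as ECE, pass higher_is_better=False. Keeping the decision rule separate from the registry calls lets it be unit-tested without a tracking server.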

See baseline §1–§5 for code-level detail.