Train ↔ Serve Consistency Audit Report: SoccerPredictAI
Date: 2026-04-28
Auditor: GitHub Copilot (Claude Opus 4.7) — /skill-ml-system-audit full (audit 06/12)
Scope: Skew between training, batch inference, and online serving feature paths
Baseline: docs/validation/20260424/06_train_serve_consistency_audit.md
Delta vs baseline
src/features/, src/pipelines/{features,inference}.py, src/app/services/predict.py, src/app/tasks/predict.py, src/app/routers/predict.py unchanged since 2026-04-26. Baseline findings remain in force.
Confirmed paths
| Path | Feature source | Computation |
|---|---|---|
| Training (final_train) | features.parquet + features_meta.parquet | offline DVC feature_engineering |
| Batch inference (batch_inference) | re-computed via compute_all_match_features() | same code as training (build_team_match_table, add_rolling_features, to_match_level, compute_elo_ratings, select_model_features) |
| Online (POST /predict/) | request body features: dict | client-supplied; no server-side recomputation |
All preprocessing (StandardScaler, SimpleImputer, OneHotEncoder) is encapsulated in the serialized sklearn Pipeline → loaded via mlflow.pyfunc.load_model("models:/soccer_clf@champion"). No manual preprocessing at serving time.
Risk register (re-confirmed)
| ID | Severity | Description | Status |
|---|---|---|---|
| TS-01 | P1 | POST /predict/ has no server-side feature computation or schema validation against model input | Open |
| TS-02 | P1 | No model hot-reload in Celery worker on champion change → manual worker restart required | Open (= R3) |
| TS-03 | P2 | Batch-inference features fresher than training features (use more-recent history); small systematic coverage skew | Open |
| TS-04 | P2 | FeatureLookupService.get_features() strips NaNs; SimpleImputer fills the gap silently | Open |
| TS-05 | P2 | No staleness check on match_features.parquet (no max-age guard); stale predictions without alerting | Open (= R5) |
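To make TS-01 concrete, the following is a minimal sketch of a server-side check that compares the client-supplied `features` dict with the column list the model expects. It is not the project's code: `FeatureCheck`, `check_features`, `order_features`, and the feature names in the usage note are hypothetical, and the sketch assumes the expected column list can be read from features_meta.parquet and passed in as a plain list.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FeatureCheck:
    """Result of comparing a request payload with the expected feature list."""

    missing: list = field(default_factory=list)
    unexpected: list = field(default_factory=list)
    nan_valued: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # NaN/None values are reported but not treated as fatal in this sketch.
        return not self.missing and not self.unexpected


def check_features(features: dict, expected: list) -> FeatureCheck:
    """Compare client-supplied features with the expected model columns."""
    expected_set = set(expected)
    result = FeatureCheck(
        missing=[name for name in expected if name not in features],
        unexpected=sorted(k for k in features if k not in expected_set),
    )
    for name in expected:
        if name in features:
            value = features[name]
            if value is None or (isinstance(value, float) and math.isnan(value)):
                result.nan_valued.append(name)
    return result


def order_features(features: dict, expected: list) -> list:
    """Return values in the expected column order; call only after the check passes."""
    return [features[name] for name in expected]
```

With a hypothetical column list `["elo_diff", "home_form_5", "away_form_5"]`, a payload that omits `away_form_5` and adds an unknown key would yield `ok == False`, and the endpoint could reject it (for example with a 4xx response) instead of passing it to the model. Reporting `nan_valued` also makes the silent imputation noted in TS-04 visible in logs.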
Summary
| Aspect | Status |
|---|---|
| Training ↔ batch inference parity (code, params, selector) | ✅ |
| Preprocessing baked into serialized model | ✅ |
| Feature order via features_meta.parquet | ✅ |
| Server-side feature recomputation for online predict | ❌ (TS-01) |
| Hot-reload of registered model | ❌ (TS-02) |
| Staleness guard on batch features | ❌ (TS-05) |
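The missing staleness guard (TS-05) could take roughly the following shape. This is a sketch under stated assumptions rather than the project's implementation: `StaleFeaturesError`, `feature_age_seconds`, and `ensure_fresh` are hypothetical names, the maximum age is a placeholder the team would need to choose, and the file's modification time is only a proxy for data freshness (a re-copied or re-checked-out file gets a new mtime without new data).

```python
import os
import time
from typing import Optional


class StaleFeaturesError(RuntimeError):
    """Raised when the feature file is older than the allowed maximum age."""


def feature_age_seconds(path: str, now: Optional[float] = None) -> float:
    """Age of the file at `path`, based on its filesystem modification time."""
    current = time.time() if now is None else now
    return current - os.path.getmtime(path)


def ensure_fresh(path: str, max_age_seconds: float, now: Optional[float] = None) -> float:
    """Return the file age in seconds, or raise if it exceeds the limit."""
    age = feature_age_seconds(path, now=now)
    if age > max_age_seconds:
        raise StaleFeaturesError(
            f"{path} is {age:.0f}s old (limit {max_age_seconds:.0f}s)"
        )
    return age
```

A serving layer could call `ensure_fresh("match_features.parquet", max_age_seconds=...)` before a lookup and map `StaleFeaturesError` to an HTTP 503, while also exporting the returned age as a metric so that alerting fires before the hard limit is reached.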
Recommendation: TS-02 and TS-05 are the top operational risks. Minimal viable fixes: (a) a periodic alias check with lazy model reload in the Celery worker, set up in the worker_process_init lifecycle (e.g. SIGUSR1 → reload); (b) a last_modified SLO on match_features.parquet, returning HTTP 503 when the file is stale.
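Fix (a) can be sketched as a small cache that re-resolves the model alias at most once per interval and reloads only when the resolved version changes. Everything here is illustrative: `ReloadingModel`, `resolve_version`, and `load_model` are hypothetical names, and the two callables stand in for whatever registry lookup and model loader the worker actually uses (for instance, a lookup of the `champion` alias and a call to `mlflow.pyfunc.load_model`).

```python
import time
from typing import Any, Callable, Optional


class ReloadingModel:
    """Caches a model and reloads it when the alias resolves to a new version."""

    def __init__(
        self,
        resolve_version: Callable[[], str],
        load_model: Callable[[str], Any],
        check_interval_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve_version = resolve_version
        self._load_model = load_model
        self._check_interval_s = check_interval_s
        self._clock = clock
        self._model: Any = None
        self._version: Optional[str] = None
        self._last_check: Optional[float] = None

    @property
    def version(self) -> Optional[str]:
        return self._version

    def get(self) -> Any:
        """Return the cached model, re-checking the alias at most once per interval."""
        now = self._clock()
        due = (
            self._last_check is None
            or now - self._last_check >= self._check_interval_s
        )
        if due:
            version = self._resolve_version()
            if version != self._version:
                # Load first, then swap: if loading raises, the previous
                # model and version stay in place.
                model = self._load_model(version)
                self._model, self._version = model, version
            self._last_check = now
        return self._model
```

One instance per worker process (created, for example, in a `worker_process_init` handler) would let each task call `get()` instead of holding a model loaded once at startup. The design trades a bounded delay of up to `check_interval_s` after a champion change for avoiding a registry lookup on every prediction; exceptions from the lookup or the load are left to the caller in this sketch.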
See baseline §1–§5 for code-level detail.