# Demo Guide — Interview Walkthrough
This page is an interview-support resource only. It is not a source of truth for implementation readiness.
For what is built and what is not, see Implementation Status. For system design, see Architecture Overview.
Use this page to prepare a walkthrough. Do not make claims here that exceed what status.md states.
## 30-second summary
"SoccerPredictAI is an end-to-end MLOps system for football match prediction. It covers the full lifecycle: Airflow scrapes data into PostgreSQL, DVC versions datasets and orchestrates a reproducible training pipeline, MLflow tracks experiments and manages the model registry, and a FastAPI service with Celery async workers serves predictions — with Prometheus metrics and health endpoints. The whole thing runs on Kubernetes with GitLab CI."
## 2-minute walkthrough
Goal: show the interviewer this is a real system, not a notebook.
- Open Implementation Status.
- Point to the matrix: "Here's exactly what's built and what isn't — I don't oversell it."
- Open System diagram.
- Walk through left to right: "Data arrives here, gets versioned here, model trained here, served here."
- One live proof: open the MLflow UI or show a terminal with `dvc dag`.
Key message: "I designed this as a system, not a collection of scripts."
## 5-minute walkthrough
Goal: show engineering depth.
- Reproducibility (1 min)
    - Show `dvc dag` output or Training Pipeline.
    - "Any clean checkout of this repo + `dvc pull` + `dvc repro` gives the same model."
- Validation rigor (1 min)
    - Open Validation Strategy.
    - "Random CV is wrong for time-series. I use temporal split: train on past, evaluate on future. And I test for leakage with property tests using `hypothesis`."
- Serving (1 min)
    - Open Serving Status or make a live API call (sketched after this list).
    - "Sync path routes through the Celery `ml` queue with a 30 s timeout. Async path uses Celery, returns a `task_id` for polling."
- Monitoring (1 min)
    - Show the `GET /metrics` response.
    - "Prometheus scrapes this. Grafana dashboards are the next step — runbooks are already written."
- Trade-off (1 min)
    - "Why DVC over just saving files? Because every model is traceable to an exact data version and code commit."
## 10-minute technical deep-dive
Goal: show system thinking, design decisions, and honest limitations.
### Step 1 — Architecture layers (2 min)
Open Architecture Overview.
Walk through the layers:
- `src/data/` — data access only, no business logic
- `src/features/` — pure functions, no IO
- `src/models/` — model code, no IO
- `src/pipelines/` — orchestration, no logic
- `src/app/` — serving, no training
"Every cross-layer shortcut is a bug waiting to happen. The architecture enforces this."
### Step 2 — ML problem and baseline (2 min)
Open Problem Formulation.
- "3-class classification: home win / draw / away win."
- "The naive baseline is bookmaker implied probabilities — that's the hardest baseline to beat."
- See Baseline & Success Metrics for the acceptance threshold.
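If asked how implied probabilities are derived, the arithmetic is simple: invert decimal odds, then normalize out the bookmaker's overround. A self-contained sketch (the odds values are made up):

```python
# Implied probabilities from decimal odds: invert, then normalize so the
# three outcomes sum to 1. The example odds are made up.
def implied_probabilities(home_odds: float, draw_odds: float, away_odds: float):
    raw = [1 / home_odds, 1 / draw_odds, 1 / away_odds]
    total = sum(raw)  # > 1 because of the bookmaker's margin (overround)
    return [p / total for p in raw]

print(implied_probabilities(2.10, 3.40, 3.60))
# -> roughly [0.45, 0.28, 0.27]
```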
### Step 3 — One architectural decision (2 min)
Open Architecture Trade-offs or pick an ADR.
Good choices:

- Why DVC over LakeFS? → Git-native, zero extra infrastructure, works with MinIO.
- Why Celery + RabbitMQ over Kafka? → Fits current scale; Kafka adds ops overhead we don't need yet.
- Why MLflow over W&B? → Self-hosted, no SaaS dependency, integrates with DVC artifacts.
### Step 4 — Tests (1 min)
"About 200 tests: unit, property-based (Hypothesis), service, contract, and load (Locust). The no-leakage invariant for rolling features is a property test — it runs on random data."
### Step 5 — Known limits (1 min)
Open Limitations or Lessons Learned.
- "Single data source — no player-level or injury data yet."
- "Feature store is file-based parquet; Redis migration is planned."
- "Grafana dashboards not yet deployed; Prometheus is exporting data."
"I know what's missing. Here's what I'd build next and why."
### Step 6 — Questions (2 min buffer)
See Common Questions below.
## Click path (for screen share)
| Step | URL / Command |
|---|---|
| 1. Status | `docs/status.md` or live site |
| 2. System diagram | `docs/index.md` |
| 3. DVC DAG | `dvc dag` in terminal |
| 4. MLflow UI | `mlflow ui --port 5001` |
| 5. API call | `curl http://api.time2bet.ru/v1/predict ...` |
| 6. Health check | `curl http://api.time2bet.ru/healthcheck/` |
| 7. Metrics | `curl http://api.time2bet.ru/metrics` |
| 8. Tests | `pytest tests/ --tb=short -q` |
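Before screen-sharing, it can help to smoke-check the live endpoints. A throwaway script using the URLs from the table above:

```python
# Quick pre-demo smoke check for the live endpoints in the click path.
import requests

for url in [
    "http://api.time2bet.ru/healthcheck/",
    "http://api.time2bet.ru/metrics",
]:
    resp = requests.get(url, timeout=5)
    print(f"{url} -> {resp.status_code}")
```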
## Common questions and good answers
Q: Why XGBoost and not a deep learning model?
"Tabular data with ~20 features and a few thousand rows per season. Tree-based ensembles consistently outperform neural nets at this scale. The architecture supports swapping models — the model signature contract is how inference is decoupled from the training framework."
Q: How do you prevent data leakage?
"Three mechanisms: temporal split by design, join predicates keyed on
match_date < split_date, and property tests withhypothesisthat verify rolling features don't see future data. Leakage is treated as a critical bug in the test suite."
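The join-predicate idea, expressed in pandas as an as-of join that only admits strictly earlier rows. The project's actual join lives in its pipeline code; every name here is made up for illustration:

```python
# The "strictly before" join predicate as a pandas as-of join: each match
# may only see form rows dated strictly earlier. All names are illustrative.
import pandas as pd

matches = pd.DataFrame({
    "match_date": pd.to_datetime(["2023-11-01", "2023-12-15"]),
    "home_team": ["A", "A"],
})
team_form = pd.DataFrame({
    "date": pd.to_datetime(["2023-10-20", "2023-12-15"]),
    "home_team": ["A", "A"],
    "form": [0.5, 0.7],
})

joined = pd.merge_asof(
    matches.sort_values("match_date"),
    team_form.sort_values("date"),
    left_on="match_date",
    right_on="date",
    by="home_team",
    allow_exact_matches=False,  # same-day rows are excluded: no peeking
)
# The 2023-12-15 match picks up form=0.5, not the same-day 0.7 row.
```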
Q: Why not use a feature store?
"It's a deliberate current-state trade-off. Features are computed offline and stored as Parquet. The offline/online parity contract is enforced by tests. A Redis-backed feature store is the next planned improvement — see Limitations."
Q: Is this deployed to production?
"It's deployed to a VPS-hosted Kubernetes cluster. The API is live at
api.time2bet.ru. Grafana and Evidently are designed but not yet integrated — I don't hide that. Status page is explicit about it."
Q: How would you scale this?
"Celery horizontal scaling: add workers. For data volume: partition by league/season in PostgreSQL. For features: move from Parquet to a proper feature store. For serving: HPA on K8s is already configured."
## What is real vs planned
For the authoritative implementation matrix, see Implementation Status.
Do not make claims in this walkthrough that exceed what status.md states.