
Test Coverage Implementation Plan — 2026-04-28 (v2)

Cycle: Test strategy execution — re-run after 20260428_test.md.

Target window: 2 weeks (Phase A: 1 day · B–D: ~1 week · E–F: ongoing).

Source of truth: Re-inventory by skill-test-coverage-audit on 2026-04-28.

Related docs:

  • docs/cicd/testing-strategy.md
  • docs/status.md
  • docs/quickstart.md
  • tests/contract/test_pipeline_contracts.py
  • .github/instructions/tests.instructions.md

Current state (verified with pytest --collect-only -q on 2026-04-28):

274 tests collected, 5 errors in 35.67s

docs/status.md and docs/quickstart.md claim "~200 tests" → stale.

New since v1: root cause of tests/test_api.py and tests/service/test_prediction_service.py collection errors identified — production import bug in src/app/config/mlflow.py (line 4): from data.params import load_params is missing the src. prefix → ModuleNotFoundError. This is not a test bug — it is a production-code bug masked by the test failure.


Gap matrix

Inventory: find src airflow/dags -name "*.py" | grep -v __pycache__ | wc -l → 94 files (including __init__.py). Non-init production modules: 70.
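The same inventory can be reproduced from Python, which is handy if the count should later be asserted in a test. This is a minimal sketch: the helper name `inventory` is hypothetical, and it only separates out `__init__.py` files, so it may not match the 70 figure exactly if that number excludes other files as well.

```python
from pathlib import Path


def inventory(roots):
    """Return (all_py, non_init) lists of .py files under the given roots."""
    all_py = sorted(
        p
        for root in roots
        for p in Path(root).rglob("*.py")
        if "__pycache__" not in p.parts  # same filter as grep -v __pycache__
    )
    non_init = [p for p in all_py if p.name != "__init__.py"]
    return all_py, non_init
```

Run from the repository root, `inventory(["src", "airflow/dags"])` should return lists whose lengths correspond to the two counts above.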

Legend: ✅ covered · ⚠ partial / broken collection · ❌ no tests · 🚧 placeholder · N/A __init__.py/config.

src/data/

| File | Tests | Status | Notes |
|---|---|---|---|
| params.py | tests/test_params.py | ✅ | unit |
| preprocess.py | tests/unit/test_preprocess.py | ✅ | unit |
| splitting.py | tests/unit/test_splitting.py + property/test_splitting_property.py | ✅ | unit + property |
| source.py | | ❌ | P2 |
| storage.py | | ❌ | P2 |

src/data_quality/

| File | Tests | Status |
|---|---|---|
| raw.py | tests/test_data_quality.py | ✅ |
| interim.py | same | ✅ |
| features.py | same | ✅ |
| finished.py | | ❌ P2 |
| future.py | | ❌ P2 |

src/features/

| File | Tests | Status | Notes |
|---|---|---|---|
| elo.py | tests/unit/test_elo.py + property | ✅ | |
| stats_matches.py | tests/test_features.py, property/test_features_property.py, unit/test_h2h.py ⚠, unit/test_rest_days.py | ⚠ | 2 tests broken (missing symbols) |
| select.py | | ❌ | P2 |

src/models/

| File | Tests | Status | Notes |
|---|---|---|---|
| metrics.py | tests/test_metrics.py, test_metrics_ece.py, test_segment_metrics.py, property/test_metrics_property.py | ✅ | |
| classification.py | tests/unit/test_classification_selection.py ⚠, unit/test_learning_curve_frac.py | ⚠ | selection test broken |
| pipelines.py | indirectly via test_final_train.py | ⚠ | no direct unit |
| final_train.py | tests/unit/test_final_train.py | ✅ | |
| tuning.py | | ❌ | P1 |

src/pipelines/ (orchestrators)

Covered only by tests/contract/test_pipeline_contracts.py (asserts stage existence + I/O contracts in dvc.yaml). No invocation tests for any pipeline entrypoint.

| File | Status |
|---|---|
| __main__.py, cli.py, source.py, preprocess.py, features.py, validate_raw.py, validate_features.py, validate_finished.py, validate_future.py, validation.py, classification.py, tune.py, final_train.py, register_model.py, inference.py, ablation.py | 🚧 contract-only |

src/app/routers/

| File | Tests | Status |
|---|---|---|
| predict.py | tests/test_api.py | ⚠ (collection error) |
| healthcheck.py | | ❌ P1 |
| livescores.py | | ❌ P1 |
| sources.py | | ❌ P1 |
| stats.py | | ❌ P1 |
| monitoring.py | | ❌ P1 |

src/app/services/ tasks/ data/ connections/ validation/ scraper/

| File | Tests | Status |
|---|---|---|
| services/predict.py | tests/service/test_prediction_service.py | ⚠ (collection error) |
| tasks/predict.py | tests/service/test_tasks.py | ✅ partial |
| tasks/livescores.py | | ❌ P2 |
| tasks/export.py | | ❌ P2 |
| data/storage.py | | ❌ P2 |
| connections/broker.py | | ❌ P2 |
| validation/livescores.py | | ❌ P2 |
| scraper/driver.py | | ❌ P2 |
| main.py, worker*.py, database.py, dependencies.py, metrics.py | | ❌ P2/P3 |

src/app/schemas/ src/app/config/

| File | Tests | Status |
|---|---|---|
| schemas/{healthcheck,predict}.py | tests/unit/test_schemas.py | ✅ |
| schemas/{models,validate}.py | | ❌ P2 |
| config/mlflow.py | | ❌ + carries production import bug |
| config/{base,database,scraper,security,storage,validate,validate_bets,gunicorn}.py | | ❌ P3 (config) |

src/ui/app/

| File | Tests | Status |
|---|---|---|
| api_client.py | | ❌ P1 |
| main.py, disclaimer.py, pages/*.py | | ❌ P1 (smoke via AppTest) |

airflow/dags/

| File | Tests | Status |
|---|---|---|
| etl_export_01.py, etl_livescores_{01,02,03,04}.py | | ❌ P1 (DagBag) |

src/utils/, src/make/

| File | Tests | Status |
|---|---|---|
| utils/mlflow_meta.py | tests/unit/test_mlflow_meta.py | ✅ |
| make/gen_secrets.py, make/merge_requirements.py | | N/A (build scripts) |

Aggregate

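| Layer | Modules | ✅ covered | ⚠ / 🚧 partial | ❌ no tests |
|---|---|---|---|---|
| src/data/ | 5 | 3 | 0 | 2 |
| src/data_quality/ | 5 | 3 | 0 | 2 |
| src/features/ | 3 | 1 | 1 | 1 |
| src/models/ | 5 | 2 | 2 | 1 |
| src/pipelines/ | 16 | 0 | 16 (contract-only) | 0 |
| src/app/routers/ | 6 | 0 | 1 | 5 |
| src/app/{services,tasks,data,connections,validation,scraper,...} | ~14 | 1 | 1 | 12 |
| src/app/schemas/ | 4 | 2 | 0 | 2 |
| src/app/config/ | 9 | 0 | 0 | 9 |
| src/ui/app/ | 5 | 0 | 0 | 5 |
| airflow/dags/ | 5 | 0 | 0 | 5 |
| src/utils/ | 1 | 1 | 0 | 0 |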

Collection errors (P0)

| Test file | Symbol(s) missing | Source module | Root cause | Action |
|---|---|---|---|---|
| tests/unit/test_h2h.py | add_h2h_features | src/features/stats_matches.py | Symbol never added or removed | restore / rewrite / delete |
| tests/unit/test_rest_days.py | add_rest_days, rest_days_feature_meta | src/features/stats_matches.py | Same | restore / rewrite / delete |
| tests/unit/test_classification_selection.py | _select_best_run | src/models/classification.py | Refactor moved logic | rewrite test against current API |
| tests/service/test_prediction_service.py | — (chain) | src/app/config/mlflow.py:4 | from data.params import load_params missing src. prefix | fix import in production code |
| tests/test_api.py | — (chain) | same as above | same | same fix unblocks both |

Phase A — Restore test signal (Day 1, P0)

Goal: green pytest tests/, honest test counts, runnable make test.

T1 — Fix production import bug in src/app/config/mlflow.py (~10 min)

  • Replace from data.params import load_params with from src.data.params import load_params.
  • Verification: python -c "from src.app.config.mlflow import mlflow_settings" succeeds.
  • Side effect: unblocks tests/test_api.py and tests/service/test_prediction_service.py.
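To see why the missing prefix matters, the failure can be reproduced outside the repository. The sketch below builds a throwaway package tree that mirrors only the paths named in T1 (everything else about the real layout is an assumption) and imports the config module in a fresh interpreter with the repository root as working directory, as the T1 verification command does. The helper name `import_succeeds` is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def import_succeeds(import_line: str) -> bool:
    """Import a stub src/app/config/mlflow.py whose only statement is
    `import_line`, using a temporary repo root as the working directory."""
    with tempfile.TemporaryDirectory() as repo:
        root = Path(repo)
        for pkg in ("src", "src/data", "src/app", "src/app/config"):
            (root / pkg).mkdir(parents=True, exist_ok=True)
            (root / pkg / "__init__.py").write_text("")
        # Stub modules: only the names from T1 are reproduced here.
        (root / "src/data/params.py").write_text(
            "def load_params():\n    return {}\n"
        )
        (root / "src/app/config/mlflow.py").write_text(import_line + "\n")
        result = subprocess.run(
            [sys.executable, "-c", "import src.app.config.mlflow"],
            cwd=root,
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
```

With the repository root on the import path, `import_succeeds("from src.data.params import load_params")` should return True, while the unprefixed form should return False because no top-level `data` package is importable.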

T2 — Resolve 3 missing-symbol collection errors (~1.5 h)

  • For each of add_h2h_features, add_rest_days/rest_days_feature_meta, _select_best_run:
  • git log --all -S '<symbol>' -- src/ to find when/why removed.
  • Choose: (a) restore symbol, (b) rewrite test to current API, or (c) delete test + update docs/status.md.
  • Verification: pytest --collect-only -q reports 0 errors.
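Before digging through git history, it can help to list exactly which imported names no longer exist. The following is a rough sketch using only the standard library: it compares the names a test file imports from one module with the top-level names that module defines. The function names are hypothetical, the dotted module path used by the real tests is an assumption, and names re-exported through other imports are not detected.

```python
import ast


def imported_names(test_source: str, module: str) -> set[str]:
    """Names a test file pulls in via `from <module> import ...`."""
    names = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.ImportFrom) and node.module == module:
            names.update(alias.name for alias in node.names)
    return names


def defined_names(module_source: str) -> set[str]:
    """Top-level functions, classes and assigned names in a module."""
    names = set()
    for node in ast.parse(module_source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
    return names


def missing_symbols(test_source: str, module: str, module_source: str) -> set[str]:
    """Imported names that the module source does not define at top level."""
    return imported_names(test_source, module) - defined_names(module_source)
```

Feeding it the text of tests/unit/test_h2h.py and src/features/stats_matches.py should, if the table above is accurate, report add_h2h_features as missing.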

T3 — Add [tool.pytest.ini_options] to pyproject.toml (~15 min)

[tool.pytest.ini_options]
pythonpath = ["src"]
testpaths = ["tests"]
markers = [
  "slow: tests that take >1s",
  "integration: requires live services",
  "load: locust load tests (excluded by default)",
]
addopts = "-q --strict-markers --tb=short"
  • Verification: pytest --collect-only reports no unknown-marker errors (--strict-markers turns unregistered markers into collection errors).

T4 — Add make test* targets to Makefile (~15 min)

test:
	pytest -m "not load and not integration"
test-fast:
	pytest tests/unit tests/property -q
test-contract:
	pytest tests/contract -q
test-coverage:
	pytest --cov=src --cov-report=term-missing --cov-report=html
  • Verification: make test-fast exits 0.

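T5 — Update doc claims (~30 min)

  • Apply the Doc drift actions D-T1 to D-T3: replace the "~200 tests" claims in docs/status.md and docs/quickstart.md with the collected count, and extend the coverage map in docs/cicd/testing-strategy.md.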

Phase A DoD

  • [ ] pytest --collect-only -q → 0 errors.
  • [ ] pytest tests/ exits green (or marks failures as known with issue links).
  • [ ] All make test* targets work.
  • [ ] No stale ~200 claim remains.
  • [ ] src/app/config/mlflow.py import is correct.

Phase B — Server-side P1 gaps

T6 — Expand tests/test_api.py to all routers

Per router: 1 happy + 1 negative test using TestClient with _get_feature_lookup/_get_predictor overrides.

  • Routers: routers/healthcheck.py, livescores.py, sources.py, stats.py, monitoring.py.
  • Verification: pytest tests/test_api.py -q.

T7 — CORS env-driven test

Parametrize CORS_ORIGINS env, restart app, assert Access-Control-Allow-Origin header.

T8 — tests/unit/test_api_client.py — mock httpx for src/ui/app/api_client.py

Cover: success, 4xx, 5xx, timeout.
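The four cases can be sketched without a network or even an httpx install by injecting a mock HTTP object. Everything below is illustrative: `ApiClient`, `ApiError`, `get_json`, the `/health` path and the 5-second timeout are hypothetical stand-ins, since the real interface of src/ui/app/api_client.py is not described here, and the built-in `TimeoutError` stands in for `httpx.TimeoutException`.

```python
from unittest.mock import MagicMock


class ApiError(Exception):
    """Hypothetical error type raised by the sketch client."""


class ApiClient:
    """Hypothetical stand-in for the client in src/ui/app/api_client.py."""

    def __init__(self, http, base_url):
        self._http = http  # injected so tests can pass a mock
        self._base_url = base_url

    def get_json(self, path):
        try:
            response = self._http.get(self._base_url + path, timeout=5.0)
        except TimeoutError as exc:  # stand-in for httpx.TimeoutException
            raise ApiError("timeout") from exc
        if response.status_code >= 400:
            raise ApiError(f"HTTP {response.status_code}")
        return response.json()


def fake_http(status_code=200, payload=None, raises=None):
    """Build a mock HTTP client that returns one canned response or raises."""
    http = MagicMock()
    if raises is not None:
        http.get.side_effect = raises
    else:
        http.get.return_value = MagicMock(status_code=status_code)
        http.get.return_value.json.return_value = payload
    return http


def outcome(http):
    """Return the payload on success, or the ApiError message on failure."""
    try:
        return ApiClient(http, "http://api").get_json("/health")
    except ApiError as exc:
        return str(exc)
```

In a real test module each case would become one parametrized test; if the production client builds its own httpx client internally, patching that constructor would replace the injection shown here.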

T9 — Helm: add a make helm-test target that runs helm lint && helm template for the chart.

T10 — (Optional, @pytest.mark.integration) MinIO smoke for src/app/data/storage.py.

Phase B DoD

  • [ ] Each src/app/routers/*.py has ≥ 1 happy + 1 negative test.
  • [ ] CORS env behaviour tested.
  • [ ] src/ui/app/api_client.py ≥ 80 % covered.
  • [ ] make helm-test exists and passes.

Phase C — ML / pipeline P1

T11 — Extend tests/contract/test_pipeline_contracts.py to all DVC stages

Currently 10 stages in EXPECTED_STAGES; verify against dvc.yaml and add any missing (tune, final_train, validate_finished, validate_future, ablation).
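The comparison itself is a set difference in both directions. The sketch below assumes dvc.yaml has already been parsed into a dict with a top-level `stages` mapping; the helper name `stage_drift` and the sample stage names are illustrative, not the real contents of EXPECTED_STAGES.

```python
def stage_drift(dvc_config, expected):
    """Compare stages declared in a parsed dvc.yaml with the expected set.

    Returns (untracked, stale): stages present in dvc.yaml but absent from
    `expected`, and names in `expected` that dvc.yaml no longer declares.
    """
    declared = set(dvc_config.get("stages", {}))
    return declared - expected, expected - declared


# Illustrative input only; the real dict would come from parsing dvc.yaml
# (for example with yaml.safe_load, assuming PyYAML is available).
dvc_config = {
    "stages": {
        "preprocess": {"cmd": "..."},
        "features": {"cmd": "..."},
        "tune": {"cmd": "..."},
    }
}
expected_stages = {"preprocess", "features"}  # stand-in for EXPECTED_STAGES

untracked, stale = stage_drift(dvc_config, expected_stages)
print(sorted(untracked), sorted(stale))  # prints ['tune'] []
```

Asserting that both sets are empty inside the contract test would make any newly added or renamed stage fail the suite until EXPECTED_STAGES is updated.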

T12 — tests/unit/test_tuning.py — Optuna 1-trial smoke for src/models/tuning.py (deterministic seed, mock MLflow).

T13 — tests/unit/test_register_model.py → src/pipelines/register_model.py with mlflow.MlflowClient mock; assert tags, alias, model URI.
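One possible shape for such a test is shown below. The `register` helper is hypothetical and only approximates what src/pipelines/register_model.py might do; the `runs:/<run_id>/model` artifact path, the model name and the tag values are assumptions. The three client methods it calls (`create_model_version`, `set_model_version_tag`, `set_registered_model_alias`) are existing `MlflowClient` methods, but the mock here is a plain `MagicMock`, so MLflow does not need to be installed.

```python
from unittest.mock import MagicMock


def register(client, name, run_id, alias, tags):
    """Hypothetical registration helper; the real entrypoint may differ."""
    model_uri = f"runs:/{run_id}/model"  # assumed artifact path
    version = client.create_model_version(name=name, source=model_uri, run_id=run_id)
    for key, value in tags.items():
        client.set_model_version_tag(name, version.version, key, value)
    client.set_registered_model_alias(name, alias, version.version)
    return model_uri


def test_register_sets_tags_alias_and_uri():
    client = MagicMock()  # stands in for mlflow.MlflowClient
    client.create_model_version.return_value = MagicMock(version="3")

    uri = register(client, "match-outcome", "abc123", "champion", {"stage": "final_train"})

    assert uri == "runs:/abc123/model"
    client.create_model_version.assert_called_once_with(
        name="match-outcome", source="runs:/abc123/model", run_id="abc123"
    )
    client.set_model_version_tag.assert_called_once_with(
        "match-outcome", "3", "stage", "final_train"
    )
    client.set_registered_model_alias.assert_called_once_with(
        "match-outcome", "champion", "3"
    )
```

Where MLflow is installed in the test environment, building the mock with `create_autospec(MlflowClient)` instead would additionally catch misspelled method names and wrong argument counts.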

T14 — (Optional CI) dvc repro -P on a tiny fixture dataset.

Phase C DoD

  • [ ] Every stage in dvc.yaml covered by contract tests.
  • [ ] Optuna smoke test exists.
  • [ ] Model registration unit-tested with MLflow mock.

Phase D — UI + Airflow

T15 — Streamlit smoke via streamlit.testing.v1.AppTest

  • One smoke per page in src/ui/app/pages/.
  • Verification: pytest tests/unit/test_ui_smoke.py.

T16 — Airflow DagBag validation

  • tests/unit/test_dag_validity.py: DagBag(include_examples=False).import_errors == {} and each DAG has default_args.retries >= 1.

Phase D DoD

  • [ ] Each src/ui/app/pages/*.py smoke-tested.
  • [ ] airflow/dags/** validated via DagBag, no import errors.

Phase E — P2/P3 expansion (ongoing)

  • T17 — Scraper snapshot tests for src/app/scraper/driver.py (recorded HTML fixtures).
  • T18 — Units for src/data/source.py, src/data/storage.py.
  • T19 — Unit for src/features/select.py.
  • T20 — Property tests for src/data/preprocess.py invariants.
  • T21 — Coverage gating: --cov-fail-under=70 once Phase B–D land.
  • T22 — Mutation testing pilot (mutmut run --paths-to-mutate src/features/elo.py).
  • T23 — Create canonical docs/testing.md consolidating conventions (link, do not inline rules from tests.instructions.md).
  • T24 — Tests for src/data_quality/{finished,future}.py.
  • T25 — Tests for additional schemas (models.py, validate.py).

Phase F — CI integration

  • T26 — .gitlab-ci.yml matrix: test:fast, test:contract, test:helm, test:dvc-smoke.
  • T27 — Coverage badge + branch protection on main.

Doc drift (D-Tn)

| ID | File | Claim | Reality | Action |
|---|---|---|---|---|
| D-T1 | docs/status.md:61 | "~200 tests" | 274 collected, 5 errors | update count + add "5 errors" caveat until Phase A complete |
| D-T2 | docs/quickstart.md:168 | "(~200 tests)" | same | update |
| D-T3 | docs/cicd/testing-strategy.md:82 | "Suggested test coverage map" | partial map; does not list routers/UI/DAGs | extend per gap matrix |
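The "no stale ~200 claim remains" check can be automated with a small scan. This is a sketch under the assumption that the stale wording always contains "~200 tests"; the helper name `find_stale_claims` is hypothetical.

```python
import re
from pathlib import Path

STALE_CLAIM = re.compile(r"~\s*200 tests")


def find_stale_claims(paths):
    """Return (path, line_number, line) for each line still claiming ~200 tests."""
    hits = []
    for path in map(Path, paths):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, 1):
            if STALE_CLAIM.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_stale_claims(["docs/status.md", "docs/quickstart.md"])` from the repository root should return an empty list once D-T1 and D-T2 are done.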

Definition of Done — overall

  • [ ] DoD-T1 pytest tests/ green, no collection errors.
  • [ ] DoD-T2 Every src/app/routers/*.py has ≥ 1 happy + 1 negative test.
  • [ ] DoD-T3 Every stage in dvc.yaml is in EXPECTED_STAGES.
  • [ ] DoD-T4 Every src/ui/app/pages/*.py collects via AppTest without exception.
  • [ ] DoD-T5 airflow/dags/** validated via DagBag with 0 import errors.
  • [ ] DoD-T6 helm lint + helm template green in CI.
  • [ ] DoD-T7 make test exists and is referenced in docs/quickstart.md.
  • [ ] DoD-T8 docs/testing.md exists with current inventory + link to tests.instructions.md.
  • [ ] DoD-T9 Coverage ≥ 70 % on src/ (gated in CI after Phase E).

2-day sprint slice

If only 2 days are budgeted: T1, T2, T3, T4, T5, T6, T11, T16. This restores signal, fixes the production import bug, gives all routers basic coverage, completes the contract-test surface, and validates DAGs.