Backfills & Freshness Policy
Data freshness
Freshness requirements vary by dataset:
- recent matches require timely updates,
- historical data changes rarely.
Freshness is monitored per dataset, and an alert is triggered when an expected update is delayed.
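As a rough illustration of the monitoring described above, the following sketch flags datasets whose last update is older than an allowed age. The dataset names and thresholds in `MAX_AGE` are hypothetical, since this policy does not state concrete values, and the function `stale_datasets` is an illustrative helper rather than part of the actual pipeline.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds: the policy only says that recent matches need
# timely updates and that historical data changes rarely.
MAX_AGE = {
    "recent_matches": timedelta(hours=6),
    "historical_matches": timedelta(days=30),
}


def stale_datasets(last_updated, now=None):
    """Return the sorted names of datasets whose last update is too old.

    last_updated maps a dataset name to a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, updated_at in last_updated.items()
        if now - updated_at > MAX_AGE[name]
    )
```

A scheduler task could call `stale_datasets` on a regular interval and raise an alert whenever the returned list is non-empty.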
Backfills
Backfills are required when:
- scraping logic changes,
- historical corrections are detected,
- schema evolution requires reprocessing.
Backfill strategy
- backfills are executed via controlled Airflow runs,
- outputs are exported as new dataset versions,
- downstream ML pipelines are explicitly re-run.
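One way to keep Airflow backfill runs controlled is to split the date range into small windows and launch them one at a time. The sketch below only builds the commands and does not execute them. It assumes the Airflow 2.x CLI form `airflow dags backfill --start-date ... --end-date ... <dag_id>` (other Airflow versions may use a different command), and the DAG id `scrape_matches` in the usage note is hypothetical.

```python
from datetime import date, timedelta


def backfill_windows(start, end, days_per_run=7):
    """Split [start, end] into consecutive inclusive windows.

    Each window covers at most days_per_run days.
    """
    if days_per_run < 1:
        raise ValueError("days_per_run must be at least 1")
    windows = []
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=days_per_run - 1), end)
        windows.append((current, window_end))
        current = window_end + timedelta(days=1)
    return windows


def backfill_commands(dag_id, start, end, days_per_run=7):
    """Build one Airflow CLI command (as an argument list) per window."""
    return [
        [
            "airflow", "dags", "backfill",
            "--start-date", window_start.isoformat(),
            "--end-date", window_end.isoformat(),
            dag_id,
        ]
        for window_start, window_end in backfill_windows(start, end, days_per_run)
    ]
```

For example, `backfill_commands("scrape_matches", date(2024, 1, 1), date(2024, 1, 10))` yields two commands, one for 1 to 7 January and one for 8 to 10 January. Each can be reviewed before it is passed to `subprocess.run`, and the dataset export and downstream ML re-runs would follow once all windows succeed.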
Safety rules
- no in-place mutation of existing datasets,
- backfills always produce new DVC versions,
- changes are documented and traceable.
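The rules above can be illustrated with a small sketch that never writes into an existing version: each backfill gets a fresh directory and a manifest recording why it was created. The layout (`<root>/<dataset>/vN`), the `MANIFEST.json` file, and the function `publish_new_version` are assumptions made for this example. In a DVC-based setup the new directory would then be tracked with DVC and committed to Git, a step not shown here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def publish_new_version(root, dataset, reason):
    """Create the next version directory for a dataset and document it.

    Existing version directories are never modified.
    """
    base = Path(root) / dataset
    base.mkdir(parents=True, exist_ok=True)

    # Collect the numbers of existing "vN" directories.
    existing = [
        int(path.name[1:])
        for path in base.iterdir()
        if path.is_dir() and path.name.startswith("v") and path.name[1:].isdigit()
    ]
    target = base / f"v{max(existing, default=0) + 1}"

    # exist_ok=False makes this fail rather than reuse an existing version.
    target.mkdir(exist_ok=False)

    manifest = {
        "dataset": dataset,
        "version": target.name,
        "reason": reason,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    (target / "MANIFEST.json").write_text(json.dumps(manifest, indent=2))
    return target
```

Backfill outputs would be written only into the returned directory, so earlier versions stay reproducible and the manifest provides the traceability the rules call for.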