Backfills & Freshness Policy

Data freshness

Freshness requirements vary by dataset:

  • recent matches require timely updates,
  • historical data changes rarely.

Freshness is monitored for each dataset, and an alert is triggered when an update is delayed beyond what that dataset allows.
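
A freshness check of this kind could be sketched as below. This is a minimal illustration, not the project's monitoring code: the dataset names, the thresholds in `MAX_AGE`, and the function `stale_datasets` are all hypothetical, since the actual limits and alerting mechanism are not specified here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds: the real per-dataset limits are not given in this document.
MAX_AGE = {
    "recent_matches": timedelta(hours=1),
    "historical_matches": timedelta(days=30),
}


def stale_datasets(last_updated, now=None, max_age=MAX_AGE):
    """Return the names of datasets whose last update is older than allowed.

    last_updated maps a dataset name to a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for name, limit in max_age.items():
        updated_at = last_updated.get(name)
        # A dataset with no recorded update is treated as stale.
        if updated_at is None or now - updated_at > limit:
            stale.append(name)
    return stale
```

A scheduler task could call `stale_datasets` with the latest update timestamps and raise an alert whenever the returned list is non-empty.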


Backfills

Backfills are required when:

  • scraping logic changes,
  • historical corrections are detected,
  • schema evolution requires reprocessing.

Backfill strategy

  • backfills are executed via controlled Airflow runs,
  • outputs are exported as new dataset versions,
  • downstream ML pipelines are explicitly re-run.
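
One way to keep backfill runs controlled is to split the affected date range into small windows and trigger one run per window. The sketch below only plans those windows and assumes nothing about the actual DAGs: `backfill_windows` and `days_per_run` are hypothetical names, and the step of handing each window to Airflow is left out.

```python
from datetime import date, timedelta


def backfill_windows(start, end, days_per_run=7):
    """Split the inclusive range [start, end] into consecutive windows.

    Each (window_start, window_end) pair is meant to become one run.
    """
    if days_per_run < 1:
        raise ValueError("days_per_run must be at least 1")
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    cursor = start
    while cursor <= end:
        # Clamp the last window so it never runs past the requested end date.
        window_end = min(cursor + timedelta(days=days_per_run - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows
```

Small windows make it easier to stop, inspect, and resume a backfill without reprocessing the whole history.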

Safety rules

  • no in-place mutation of existing datasets,
  • backfills always produce new DVC versions,
  • changes are documented and traceable.
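
The first and third rules can be illustrated with a small export helper. This is a sketch under stated assumptions: plain version directories stand in for DVC versions (it does not call DVC), and `export_new_version`, `data.json`, and `manifest.json` are hypothetical names chosen for the example.

```python
import json
from pathlib import Path


def export_new_version(root, version, rows, reason):
    """Write rows to a fresh version directory and record why it was created."""
    target = Path(root) / version
    # exist_ok=False makes an existing version raise FileExistsError,
    # so an already exported dataset is never overwritten in place.
    target.mkdir(parents=True, exist_ok=False)
    (target / "data.json").write_text(json.dumps(rows))
    # The manifest documents the change so the version stays traceable.
    manifest = {"version": version, "reason": reason, "row_count": len(rows)}
    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return target
```

Calling the helper twice with the same version name fails on the second call, which is the intended behaviour: a correction has to be exported under a new version.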