
Configuration Reference (Hydra)

Time2Bet uses Hydra to manage configuration for:

  • data paths and dataset selection
  • feature and training parameters
  • evaluation behavior
  • model registration settings
  • environment-dependent overrides

Hydra configuration is treated as part of the reproducibility story: every run logs its full resolved config.


The Hydra configuration directory is typically structured as:

  • configs/
      • config.yaml (root/defaults)
      • data/ (dataset sources, versions, filters)
      • features/ (feature flags, windows)
      • model/ (model type + hyperparams)
      • train/ (splits, seeds, CV strategy)
      • eval/ (metrics, reports)
      • registry/ (MLflow model name, aliases)
      • env/ (dev/staging/prod overrides)
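In Hydra, each subdirectory of the config root is a config group, and each YAML file inside it is an option that can be selected on the command line (with the layout above, `model=xgboost` would select `configs/model/xgboost.yaml`). As a rough sketch, the helper below lists the options each group offers. The function name `list_group_options` is hypothetical and not part of Time2Bet or Hydra, and it assumes options use the `.yaml` extension.

```python
from pathlib import Path


def list_group_options(configs_dir: str) -> dict[str, list[str]]:
    """Map each config group (subdirectory) to the option names it offers.

    An option name is the stem of a YAML file inside the group directory.
    Top-level files such as config.yaml are not groups and are skipped.
    """
    groups: dict[str, list[str]] = {}
    for group_dir in sorted(Path(configs_dir).iterdir()):
        if group_dir.is_dir():
            groups[group_dir.name] = sorted(p.stem for p in group_dir.glob("*.yaml"))
    return groups
```

Calling `list_group_options("configs")` from the repository root would show, for instance, which names are valid values for `env=`.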

Key principles

  • Defaults define a safe baseline.
  • Overrides are explicit and traceable.
  • Environment-specific config is isolated (no hidden branching logic).

Common usage patterns

Run with defaults

python -m src.pipelines.train

Override model type

python -m src.pipelines.train model=xgboost

Override train seed and split date

python -m src.pipelines.train train.seed=42 train.cutoff_date=2026-01-01

Environment override

python -m src.pipelines.train env=prod
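To make the effect of value overrides such as `train.seed=42` concrete, the sketch below applies dotted `key=value` pairs to a nested dictionary. It is a simplified stdlib illustration, not Hydra's actual override parser: it ignores group selection (`model=xgboost`, `env=prod`) and Hydra's `+`/`~` prefixes, and the default values shown are made up.

```python
from copy import deepcopy


def parse_value(raw: str):
    """Coerce a raw override value to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of `config` with each `a.b.c=value` override applied."""
    result = deepcopy(config)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested dicts, creating missing levels as needed.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults; the real values would live under configs/train/.
defaults = {"train": {"seed": 0, "cutoff_date": "2025-01-01"}}
resolved = apply_overrides(
    defaults, ["train.seed=42", "train.cutoff_date=2026-01-01"]
)
print(resolved)
```

The defaults dictionary is left untouched, which mirrors the principle that overrides are explicit additions on top of a safe baseline.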

What gets logged to MLflow

Each run logs:

  • resolved Hydra config (as YAML artifact)
  • git commit hash
  • DVC dataset revision
  • parameters and metrics

Together, these let any run be traced back to the exact configuration, code revision, and dataset revision that produced it.
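MLflow parameters are flat key-value pairs, so a nested resolved config is usually flattened before being logged as parameters. The following is a minimal sketch of that step plus a git commit lookup. `flatten` and `git_commit` are hypothetical helpers, not Time2Bet's actual logging code.

```python
import subprocess


def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten a nested config into dotted keys, e.g. {"train": {"seed": 42}} -> {"train.seed": 42}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def git_commit() -> str:
    """Return the current commit hash, or "unknown" when git or a checkout is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


# Hypothetical resolved config used only for illustration.
resolved_config = {"model": {"name": "xgboost"}, "train": {"seed": 42}}
run_record = {"params": flatten(resolved_config), "git_commit": git_commit()}
print(run_record["params"])
```

In a real run, the flattened dictionary could be passed to `mlflow.log_params` and the commit hash to `mlflow.set_tag`. How Time2Bet obtains the DVC dataset revision is not covered by this sketch.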


  • ML → Training Pipeline
  • ML → Model Registry
  • Data → Dataset Versioning