Configuration Reference (Hydra)¶
Time2Bet uses Hydra to manage configuration for:
- data paths and dataset selection
- feature and training parameters
- evaluation behavior
- model registration settings
- environment-dependent overrides
Hydra configuration is treated as part of the reproducibility story: every run logs its full resolved config.
Config structure (recommended)¶
The Hydra config directory is typically structured as:
- configs/
  - config.yaml (root/defaults)
  - data/ (dataset sources, versions, filters)
  - features/ (feature flags, windows)
  - model/ (model type + hyperparams)
  - train/ (splits, seeds, CV strategy)
  - eval/ (metrics, reports)
  - registry/ (MLflow model name, aliases)
  - env/ (dev/staging/prod overrides)
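As a rough illustration of how a root defaults list selects one option per config group, the sketch below composes a single config from plain dictionaries. It uses only the standard library and mimics the idea rather than Hydra's implementation; the option names (`baseline`, `xgboost`, `default`) and every key and value inside them are hypothetical.

```python
from copy import deepcopy

# Hypothetical options per config group; in a real project each entry
# would be a YAML file such as configs/model/xgboost.yaml.
GROUPS = {
    "model": {
        "baseline": {"type": "logreg", "C": 1.0},
        "xgboost": {"type": "xgboost", "max_depth": 6},
    },
    "train": {
        "default": {"seed": 42, "cv_folds": 5},
    },
}

# Hypothetical defaults: the option each group uses when nothing is overridden.
DEFAULTS = {"model": "baseline", "train": "default"}


def compose(selection=None):
    """Build one config by picking a single option for every group."""
    chosen = {**DEFAULTS, **(selection or {})}
    config = {}
    for group, option in chosen.items():
        if option not in GROUPS.get(group, {}):
            raise KeyError(f"unknown option {option!r} for group {group!r}")
        config[group] = deepcopy(GROUPS[group][option])
    return config


default_cfg = compose()
xgb_cfg = compose({"model": "xgboost"})  # mirrors selecting `model=xgboost`
```

Only the selected group changes; every other group keeps its default option, which is what makes the baseline safe to rely on.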
Key principles¶
- Defaults define a safe baseline.
- Overrides are explicit and traceable.
- Environment-specific config is isolated (no hidden branching logic).
Common usage patterns¶
Run with defaults¶
```bash
python -m src.pipelines.train
```
Override model type¶
```bash
python -m src.pipelines.train model=xgboost
```
Override train seed and split date¶
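A minimal sketch of what a dotted override does to the resolved config, assuming hypothetical keys `train.seed` and `train.split_date` (the real key names depend on the files under `configs/train/`). On the command line, such an override would take the form `python -m src.pipelines.train train.seed=123 train.split_date=2024-01-01`. The helper below is a standard-library stand-in, not Hydra's own parser.

```python
from copy import deepcopy


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b=value` style overrides applied."""
    result = deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        # Keep integers as integers; anything else stays a string.
        try:
            node[leaf] = int(raw_value)
        except ValueError:
            node[leaf] = raw_value
    return result


# Hypothetical baseline values.
baseline = {"train": {"seed": 42, "split_date": "2023-06-30"}}
run_cfg = apply_overrides(
    baseline, ["train.seed=123", "train.split_date=2024-01-01"]
)
```

The baseline dictionary is left untouched, so the difference between `baseline` and `run_cfg` is exactly the set of explicit overrides.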
Environment override¶
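One way to picture an isolated environment override is as a small overlay merged on top of the baseline, so pipeline code never branches on the environment name. The sketch below is a standard-library illustration under that assumption; the overlay contents and all key names are hypothetical. If `env` appears in the defaults list, the matching command-line selection would be `python -m src.pipelines.train env=prod`.

```python
def deep_merge(base, overlay):
    """Return a new dict with `overlay` merged over `base`; overlay values win."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical baseline and production overlay.
base_cfg = {
    "registry": {"model_name": "time2bet-model", "alias": "candidate"},
    "data": {"root": "data/"},
}
prod_overlay = {"registry": {"alias": "champion"}}

prod_cfg = deep_merge(base_cfg, prod_overlay)
```

Because the overlay lists only the keys that differ, the environment-specific part of the config stays small and easy to review.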
What gets logged to MLflow¶
Each run logs:
- resolved Hydra config (as YAML artifact)
- git commit hash
- DVC dataset revision
- parameters and metrics
Together, these let any run be traced back to the exact code, data, and configuration that produced it.
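MLflow parameters are flat key-value pairs, so a nested resolved config generally needs flattening before it can be logged as parameters. The sketch below shows one way to do that with the standard library; the example config is hypothetical, and nothing here is confirmed as the way Time2Bet performs this step. The resulting dictionary is the kind of input `mlflow.log_params` accepts.

```python
def flatten(config, prefix=""):
    """Turn a nested dict into {"a.b": value} pairs for parameter logging."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical resolved config for one run.
resolved = {
    "model": {"type": "xgboost", "max_depth": 6},
    "train": {"seed": 123},
}
params = flatten(resolved)
```

Dotted names keep each parameter tied to its config group, which makes runs easier to compare in the tracking UI.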
Related docs¶
- ML → Training Pipeline
- ML → Model Registry
- Data → Dataset Versioning