Code Structure
Directory layout, naming conventions, and where to find what.
Top-Level Layout
soccer/
├── src/                # All production Python code
│   ├── data/           # Data access, schemas, splits, storage abstractions
│   ├── features/       # Feature engineering (pure, deterministic)
│   ├── models/         # Models, losses, metrics (NO IO)
│   ├── pipelines/      # DVC / CLI orchestration entrypoints
│   └── app/            # FastAPI service layer
│       └── tasks/      # Celery async jobs
├── airflow/
│   └── dags/           # Scheduled production pipelines
├── configs/            # Hydra configuration files
├── data/               # DVC-versioned datasets & artifacts
│   ├── raw/
│   ├── processed/
│   ├── features/
│   ├── splits/
│   ├── models/
│   └── predictions/
├── docs/               # MkDocs documentation
├── reports/            # Quarto reports (EDA, evaluation), not production
├── tests/              # pytest test suite
├── docker/             # Dockerfiles per service
├── k8s/                # Kubernetes manifests / Helm charts
├── dvc.yaml            # DVC pipeline definition
├── params.yaml         # DVC / Hydra parameters
└── pyproject.toml      # Project metadata and tool configuration
src/ Layer Rules
| Layer | Allowed | Forbidden |
|---|---|---|
| `src/data/` | DB access, MinIO, schema validation | ML logic, feature code |
| `src/features/` | Pure feature transforms | IO, model calls |
| `src/models/` | Model classes, metrics, losses | IO of any kind |
| `src/pipelines/` | Orchestration, CLI entrypoints | Business logic |
| `src/app/` | FastAPI routers, dependency injection | Training, feature engineering |
| `src/app/tasks/` | Celery task definitions | Inline ML logic |
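Rules like these are easier to keep when they are checked automatically. The following is a minimal sketch of such a check, assuming the layers are importable as `src.data`, `src.features`, and so on. The `FORBIDDEN_IMPORTS` mapping and both function names are hypothetical, and the mapping is only one reading of the table for two of the layers. It inspects imports only, so it cannot detect IO done through the standard library.

```python
import ast

# Hypothetical mapping: layer package -> project packages it must not import.
# This is one interpretation of the layer table, not a rule set from the repo.
FORBIDDEN_IMPORTS = {
    "src.data": ("src.features", "src.models"),
    "src.features": ("src.data", "src.models"),
}


def imported_modules(source: str) -> set[str]:
    """Return the dotted names of all absolute imports found in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def layer_violations(layer: str, source: str) -> list[str]:
    """List the imports in `source` that `layer` is not allowed to make."""
    forbidden = FORBIDDEN_IMPORTS.get(layer, ())
    return sorted(
        module
        for module in imported_modules(source)
        if any(module == name or module.startswith(name + ".") for name in forbidden)
    )
```

Calling `layer_violations("src.features", source_text)` on the text of each file under `src/features/` would flag any feature module that reaches into the data or model layers.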
Naming Conventions
- Files: `snake_case.py`
- Classes: `PascalCase`
- Functions / variables: `snake_case`
- Constants: `UPPER_SNAKE_CASE`
- Hydra configs: `snake_case.yaml`
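As an illustration, the conventions above applied to a single module might look like the sketch below. The file name, class, constant, and method are all hypothetical and are not taken from the repository.

```python
"""Hypothetical module: src/features/rolling_form.py (file name in snake_case)."""

DEFAULT_WINDOW_SIZE = 3  # constant: UPPER_SNAKE_CASE


class RollingFormTransformer:  # class: PascalCase
    def __init__(self, window_size: int = DEFAULT_WINDOW_SIZE) -> None:
        self.window_size = window_size  # variable: snake_case

    def rolling_mean(self, points: list[float]) -> list[float]:  # function: snake_case
        """Mean of the last `window_size` values at each position."""
        means = []
        for end_index in range(1, len(points) + 1):
            window = points[max(0, end_index - self.window_size):end_index]
            means.append(sum(window) / len(window))
        return means
```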
Configuration
- All parameters in `params.yaml` (DVC) and `configs/` (Hydra)
- No hardcoded paths, seeds, or credentials anywhere in `src/`
- Secrets via SOPS + age (see Security)
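One way to honor the "no hardcoded paths or seeds" rule is to pass every such value into the code as a typed parameter object. The sketch below assumes the parameters have already been parsed into a mapping (for example from `params.yaml` or a Hydra config). The `split` section, its keys, and the names `SplitParams` and `split_indices` are hypothetical.

```python
import random
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping


@dataclass(frozen=True)
class SplitParams:
    """Typed view of a hypothetical `split` section of the parsed parameters."""

    seed: int
    test_fraction: float
    output_dir: Path

    @classmethod
    def from_mapping(cls, params: Mapping[str, Any]) -> "SplitParams":
        section = params["split"]  # hypothetical section name
        return cls(
            seed=int(section["seed"]),
            test_fraction=float(section["test_fraction"]),
            output_dir=Path(section["output_dir"]),
        )


def split_indices(n_rows: int, params: SplitParams) -> tuple[list[int], list[int]]:
    """Return (train, test) row indices; the seed comes from config, not code."""
    indices = list(range(n_rows))
    random.Random(params.seed).shuffle(indices)
    n_test = round(n_rows * params.test_fraction)
    return indices[n_test:], indices[:n_test]
```

Because the seed and output directory arrive through `SplitParams`, changing them is a configuration edit rather than a code change, and the same parameters always reproduce the same split.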
Tests
tests/
├── unit/ # Pure function tests (fast, no IO)
├── integration/ # Tests requiring DB or external services
└── conftest.py # Shared fixtures
Run all tests: `pytest tests/`
See Testing Strategy for full policy.
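For orientation, a test under `tests/unit/` might look like the sketch below: plain functions whose names start with `test_`, using bare `assert` statements, with no database or network access. The file name and `goal_difference` are hypothetical; in the real suite the function under test would be imported from `src/` rather than defined in the test file.

```python
# Hypothetical file: tests/unit/test_goal_difference.py


def goal_difference(goals_for: int, goals_against: int) -> int:
    """Stand-in for a pure function that would live under src/features/."""
    return goals_for - goals_against


def test_goal_difference_is_positive_for_a_win() -> None:
    assert goal_difference(3, 1) == 2


def test_goal_difference_is_antisymmetric() -> None:
    assert goal_difference(1, 3) == -goal_difference(3, 1)
```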