Serving¶
Application Entry-point¶
prometheus_metrics()
¶
Prometheus scrape endpoint.
Uses MultiProcessCollector when PROMETHEUS_MULTIPROC_DIR is set
(required for Gunicorn multi-worker deployments).
Source code in src/app/main.py
Database¶
Dependencies¶
get_token_header(x_api_key=None)
async
¶
Validate the X-API-Key request header.
Uses hmac.compare_digest for constant-time comparison to prevent
timing-based key enumeration attacks. Returns 401 when the header is
absent or does not match the configured secret (FASTAPI_HEADER_TOKEN).
Source code in src/app/dependencies.py
get_query_token(token=None)
async
¶
Validate the token query parameter.
Source code in src/app/dependencies.py
Routers¶
list_upcoming_matches(lookup)
async
¶
Return upcoming matches available for prediction (from batch inference output).
Each item includes match_id, teams, and date. Use these IDs with
GET /predict/{match_id}.
Source code in src/app/routers/predict.py
predict(request, stage)
async
¶
Synchronous 1×2 outcome prediction from provided features.
Submits the task to the ml Celery worker and blocks until the result
is ready (up to _SYNC_TIMEOUT seconds).
Use ?stage=challenger to use the challenger model (requires
MLFLOW_MODEL_STAGES=champion,challenger on the worker).
Source code in src/app/routers/predict.py
predict_precomputed(match_id, pred_lookup)
async
¶
Return the precomputed prediction for a match from predictions.parquet.
Predictions are produced by the batch_inference DVC stage which runs
model.predict() over all matches and saves the result to MinIO.
This endpoint reads directly from the in-memory cache — no Celery task,
no MLflow model call at request time.
Returns 404 if the match is not found in the latest batch output.
Source code in src/app/routers/predict.py
predict_by_match_id(match_id, lookup, stage)
async
¶
Predict outcome for a match using precomputed features.
Features are read from data/predictions/match_features.parquet
produced by the batch_inference DVC stage.
Submits the task to the ml Celery worker and blocks until the result
is ready. Returns 404 if match_id is not in the current batch output.
Use ?stage=challenger to use the challenger model (requires
MLFLOW_MODEL_STAGES=champion,challenger on the worker).
Source code in src/app/routers/predict.py
model_info(stage)
async
¶
Return the current model's metadata from the MLflow Model Registry.
Delegates to the ml Celery worker where MLflow is available.
Includes version, stage, run metrics, and the input feature schema.
Source code in src/app/routers/predict.py
predict_async(request, lookup)
async
¶
Submit a 1×2 prediction to the Celery queue.
Returns a task_id immediately. Poll
GET /monitoring/task_status/{task_id} until status == 'SUCCESS'
to retrieve the result.
Returns 404 if the match has no precomputed features.
Source code in src/app/routers/predict.py
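A client-side polling loop for this endpoint might look like the following sketch; `get_status` stands in for an HTTP call to `GET /monitoring/task_status/{task_id}`:

```python
import time

def poll_until_success(get_status, task_id, interval=0.5, timeout=30.0):
    """Poll until the task reports SUCCESS, then return its result.

    get_status is any callable mapping a task_id to a dict with at least
    'status' and 'result' keys (here it stands in for the HTTP call).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        payload = get_status(task_id)
        if payload["status"] == "SUCCESS":
            return payload["result"]
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {timeout}s")
```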
get_drift_status()
¶
Return the latest feature drift summary and refresh the Prometheus gauge.
The report is written by the monitor_drift DVC stage.
If the report does not yet exist, returns drift_score=null.
Source code in src/app/routers/monitoring.py
get_queue_stats()
¶
Return active/scheduled/reserved task counts and worker stats.
Source code in src/app/routers/monitoring.py
get_workers()
¶
Return active queues and ping status for all connected workers.
Source code in src/app/routers/monitoring.py
Services¶
FeatureLookupService
¶
Loads precomputed features for all matches from the batch inference output.
The parquet file is produced by the batch_inference DVC stage and has
the match id as its index. It contains both upcoming matches and
finished matches (with outcome_1x2, homeScore, awayScore).
The service is loaded lazily on the first call and cached in-process.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| features_path | Path \| None | Absolute path to the batch-inference features parquet file. | None |
Source code in src/app/services/predict.py
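The lazy, in-process caching described above follows a common pattern; this is a minimal stand-alone sketch in which the `loader` callable is a stand-in for reading the parquet file:

```python
class LazyFeatureCache:
    """Illustrative sketch: load once on first access, then serve from memory."""

    def __init__(self, loader):
        self._loader = loader       # e.g. reads the batch-inference parquet
        self._features = None

    def _ensure_loaded(self):
        if self._features is None:  # only the first call pays the load cost
            self._features = self._loader()
        return self._features

    def get_features(self, match_id):
        return self._ensure_loaded().get(match_id)
```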
features_computed_at
property
¶
UTC datetime when the feature file was last written (= last batch_inference run).
get_features(match_id)
¶
Return the feature dict for match_id, or None if not found.
Source code in src/app/services/predict.py
list_matches()
¶
Return a lightweight list of upcoming matches for UI display.
Source code in src/app/services/predict.py
PredictionLookupService
¶
Serves precomputed batch predictions from the batch_inference DVC stage output.
Mirrors the FeatureLookupService caching pattern:
- Checks local file first (dev / CI).
- Falls back to MinIO with a configurable re-check interval.
The parquet file is indexed by match id and must contain the columns
proba_home, proba_draw, proba_away, predicted_class and
predicted_label; it may also contain is_future, startTimeUtc,
homeTeamName, awayTeamName, model_run_id and model_stage.
Source code in src/app/services/predict.py
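The local-first fallback with a re-check interval can be sketched like this; all names here are illustrative and `fetch_remote` stands in for a MinIO download:

```python
import time
from pathlib import Path

class LocalFirstSource:
    """Sketch of the caching pattern above: prefer a local file (dev/CI),
    otherwise fall back to remote storage, re-checking at most every
    `recheck_seconds`."""

    def __init__(self, local_path, fetch_remote, recheck_seconds=60.0):
        self.local_path = Path(local_path)
        self.fetch_remote = fetch_remote   # stands in for a MinIO download
        self.recheck_seconds = recheck_seconds
        self._data = None
        self._checked_at = float("-inf")

    def load(self):
        now = time.monotonic()
        if self._data is not None and now - self._checked_at < self.recheck_seconds:
            return self._data              # cache still fresh, skip the re-check
        if self.local_path.exists():       # dev / CI: local file wins
            self._data = self.local_path.read_bytes()
        else:                              # otherwise fall back to object storage
            self._data = self.fetch_remote()
        self._checked_at = now
        return self._data
```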
predictions_computed_at
property
¶
UTC datetime when predictions.parquet was last written.
get_prediction(match_id)
¶
Return the prediction dict for match_id, or None if not found.
Source code in src/app/services/predict.py
list_matches()
¶
Return all rows as a list of dicts (for diagnostics/admin endpoints).
PredictionService
¶
Loads and serves a model from the MLflow Model Registry.
The model is loaded lazily on the first call to predict and then
cached in-process for the lifetime of the worker.
Source code in src/app/services/predict.py
load()
¶
Eagerly load the model. Safe to call multiple times (idempotent).
Call this during application startup (e.g. FastAPI lifespan or
Celery worker_process_init) to avoid paying the cold-start
penalty on the first user request.
Source code in src/app/services/predict.py
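Wiring load() into application startup might look like the following sketch; the stub class only mimics the idempotent contract described above:

```python
from contextlib import asynccontextmanager

class PredictionServiceStub:
    """Stand-in with the same idempotent load() contract."""

    def __init__(self):
        self.load_calls = 0
        self.loaded = False

    def load(self):
        if not self.loaded:      # repeat calls after the first are no-ops
            self.load_calls += 1
            self.loaded = True

service = PredictionServiceStub()

@asynccontextmanager
async def lifespan(app):
    # pay the model cold start here, not on the first user request
    service.load()
    yield
```

The same call can be made from a Celery worker_process_init handler so each worker process warms up before taking tasks.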
get_model_info()
¶
Return model metadata from the MLflow Model Registry.
Queries the registered model for the configured stage, then fetches run metrics and params. Does NOT require the model to be loaded.
Source code in src/app/services/predict.py
predict(features, match_id=None, features_computed_at=None)
¶
Run inference for a single match.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| features | dict | Feature dict matching the model input schema. | required |
| match_id | int \| None | Optional identifier for downstream tracing. | None |
| features_computed_at | datetime \| None | UTC timestamp when the features were produced (batch_inference stage). Stored in the response for traceability. | None |

Returns:

| Type | Description |
|---|---|
| dict | Dict compatible with PredictResponse. |
Source code in src/app/services/predict.py
Schemas¶
PredictRequest
¶
Bases: BaseModel
Input features for a single match prediction.
Features must match the model's training schema exactly. The model expects rolling-window difference features (side='diff', window=5) plus a categorical 'sex' column (0=men, 1=women).
Source code in src/app/schemas/predict.py
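The exact feature pipeline lives in the training code, but a side='diff' rolling feature can be illustrated roughly like this hypothetical helper (window=5 as above):

```python
def rolling_diff(home_history, away_history, window=5):
    """Illustrative only: mean of each team's last `window` values,
    then home minus away. The real features come from the training
    pipeline, not from this helper."""
    h = home_history[-window:]
    a = away_history[-window:]
    return sum(h) / len(h) - sum(a) / len(a)
```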
AsyncPredictRequest
¶
AsyncPredictResponse
¶
Bases: BaseModel
Returned immediately after submitting an async prediction task.
Source code in src/app/schemas/predict.py
ModelInfoResponse
¶
Bases: BaseModel
MLflow model metadata returned by GET /predict/model/info.
Source code in src/app/schemas/predict.py
PrecomputedPredictResponse
¶
Bases: BaseModel
Response for GET /predict/precomputed/{match_id}.
Served directly from predictions.parquet produced by the batch_inference DVC stage — no Celery task, no MLflow model call at request time.
Source code in src/app/schemas/predict.py
MatchRawLive
¶
Bases: BaseModel
Projected subset of MatchRaw for live-scores display.
Source code in src/app/schemas/models.py
Celery Tasks¶
Celery task: asynchronous match outcome prediction.
Submitted by POST /predict/async/ and executed by the ml Celery worker.
The result is stored in the Celery result backend and can be retrieved via
GET /monitoring/task_status/{task_id}.
The task result has the same shape as PredictResponse so the Streamlit
polling page can display it directly.
Architecture note¶
The PredictionService is initialised once per worker process via the
worker_process_init Celery signal. This avoids loading the MLflow model
(potentially hundreds of MB from MinIO) on every task invocation.
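The per-process initialisation pattern can be sketched without Celery itself; in the app, init_prediction_service would be connected to the worker_process_init signal, and load_model here is a stub for the MLflow/MinIO load:

```python
_service = None  # module-level: one instance per worker process

def load_model():
    """Stub standing in for the expensive MLflow model download."""
    return {"model": "stub"}

def init_prediction_service(**_signal_kwargs):
    # in the real app this is connected via @worker_process_init.connect
    global _service
    if _service is None:
        _service = load_model()   # runs once per process, not per task

def get_service():
    if _service is None:
        raise RuntimeError("worker_process_init has not run yet")
    return _service
```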
predict_match(self, match_id, features, features_computed_at=None, model_stage=None)
¶
Run 1×2 inference for match_id using pre-computed features.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| match_id | int | Identifier used for logging and response tracing. | required |
| features | dict | Feature dict matching the model's input schema, produced by the batch_inference stage. | required |
| features_computed_at | str \| None | ISO-8601 UTC string of when the features were computed (batch_inference mtime). Stored in the response for end-to-end traceability. | None |
| model_stage | str \| None | MLflow alias/stage to use (e.g. "champion", "challenger"). Defaults to the champion stage. | None |

Returns:

| Type | Description |
|---|---|
| dict | Dict compatible with PredictResponse. |
Source code in src/app/tasks/predict.py
get_model_info(self, model_stage=None)
¶
Retrieve MLflow model metadata from the registry.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_stage | str \| None | Stage/alias to query. Defaults to the champion stage. | None |

Returns:

| Type | Description |
|---|---|
| dict | Dict compatible with ModelInfoResponse. |
Source code in src/app/tasks/predict.py
export_data_raw(self, name_table)
¶
Export data from a database table to a Parquet file in MinIO.
Source code in src/app/tasks/export.py
Prometheus Metrics¶
Prometheus metrics registry for the SoccerPredictAI service.
All metric objects are defined here as module-level singletons so they are shared across the FastAPI app and Celery worker within the same process.
Gunicorn multiprocess note¶
When running under Gunicorn with multiple workers, prometheus-client
requires PROMETHEUS_MULTIPROC_DIR to be set to a writable directory.
The /metrics endpoint in main.py uses MultiProcessCollector
automatically when that variable is present.
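A launch under this setup might look like the following configuration fragment; the directory path and worker count are illustrative, and the app module path is assumed from the src/app/main.py source note above:

```shell
# prometheus-client multiprocess mode needs a writable scratch directory
export PROMETHEUS_MULTIPROC_DIR=/tmp/prometheus-multiproc
mkdir -p "$PROMETHEUS_MULTIPROC_DIR"
gunicorn --workers 4 --worker-class uvicorn.workers.UvicornWorker src.app.main:app
```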