Inference API Contract

API principles

The inference API follows these principles:

- explicit and versioned request/response schemas,
- strict input validation,
- clear separation between transport and inference logic.

Core endpoints

`GET /healthcheck` ✅ Implemented

Health and readiness probe.

Status: Operational

Characteristics:

- returns service status
- reports memory usage metrics
- includes worker ID information

Used by:

- Kubernetes liveness/readiness checks
- external monitoring systems

Example:

curl http://localhost:8000/healthcheck/
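To illustrate the kind of payload such a probe might return, here is a minimal stdlib-only sketch. The field names (`status`, `memory_mb`, `worker_id`), the use of the process ID as the worker ID, and the Unix-only `resource` module are all assumptions; the service's actual response schema is not specified here.

```python
import os
import resource  # Unix-only; assumption: the service runs on Linux or macOS
import sys


def build_health_payload() -> dict:
    """Assemble a health payload with status, memory usage, and worker ID.

    All field names are hypothetical; the real response schema may differ.
    """
    # ru_maxrss is the peak resident set size: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return {
        "status": "ok",
        "memory_mb": round(peak_rss / divisor, 2),
        "worker_id": os.getpid(),  # assumption: one worker per process
    }


payload = build_health_payload()
```

A liveness probe would typically only check for a 200 status, while monitoring systems could read the memory and worker fields.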

`POST /predict` ✅ Implemented

Synchronous inference endpoint.

Status: Implemented. Accepts a feature dict and returns 1×2 probabilities from an MLflow-tracked model.

Characteristics:

- validated input schema (Pydantic)
- returns the predicted class and per-class probabilities
- includes the MLflow run_id for full traceability

Example:

curl -X POST http://localhost:8000/predict/ \
  -H "Content-Type: application/json" \
  -d '{"match_id": 99, "features": {"diff_win_5_mean": 0.3, "sex": 0}}'

Example response:

{
  "match_id": 99,
  "prediction": {
    "predicted_class": 0,
    "probabilities": {"0": 0.58, "1": 0.27, "2": 0.15},
    "model_version": "Staging",
    "model_run_id": "3f7a1c9d2e4b"
  }
}

Typical use cases:

- interactive UI requests
- small batch predictions
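For callers that prefer Python over curl, the request above could be built roughly as follows. This is a sketch using only the standard library; the helper name `build_predict_request` is hypothetical, and the URL and payload simply mirror the curl example.

```python
import json
import urllib.request


def build_predict_request(
    base_url: str, match_id: int, features: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /predict/ endpoint."""
    body = json.dumps({"match_id": match_id, "features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    "http://localhost:8000", 99, {"diff_win_5_mean": 0.3, "sex": 0}
)
# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request, timeout=5) as resp:
#       result = json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect in tests without a running server.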

`POST /predict/async` 📋 Planned

Asynchronous inference submission.

Status: Celery infrastructure is ready but not yet integrated.

Planned characteristics:

- returns a job ID
- execution handled by background workers
- result retrieved via polling or callback

`GET /metrics` 📋 Planned

Prometheus-compatible metrics endpoint.

Status: Designed but not implemented
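As a rough idea of what "Prometheus-compatible" output looks like, the sketch below renders counters in the Prometheus text exposition format using only the standard library. The metric names are hypothetical, and a real implementation would more likely rely on a Prometheus client library than on hand-written formatting.

```python
def render_prometheus(metrics: dict) -> str:
    """Render counters in the Prometheus text exposition format.

    `metrics` maps a metric name to a (help text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names, for illustration only.
text = render_prometheus({
    "predict_requests_total": ("Total /predict requests.", 42),
    "predict_errors_total": ("Total failed /predict requests.", 3),
})
```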

Input schema

The input schema is:

- defined via Pydantic models,
- aligned with the MLflow model signature,
- strict: unknown or malformed fields are rejected.

Input validation failures are treated as client errors.
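The rejection rules can be illustrated with a dependency-free stand-in for the Pydantic models. This is a sketch under stated assumptions: the allowed fields are taken from the example request, feature values are assumed to be numeric, and 422 is an assumed choice of client-error status.

```python
ALLOWED_TOP_LEVEL = {"match_id", "features"}  # taken from the example request


def validate_predict_payload(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    unknown = set(payload) - ALLOWED_TOP_LEVEL
    if unknown:
        errors.append(f"unknown fields: {sorted(unknown)}")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    match_id = payload.get("match_id")
    if not isinstance(match_id, int) or isinstance(match_id, bool):
        errors.append("match_id must be an integer")
    features = payload.get("features")
    if not isinstance(features, dict):
        errors.append("features must be an object")
    else:
        for name, value in features.items():
            # Assumption: every feature value is numeric.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"feature {name!r} must be numeric")
    return errors


def status_for(errors: list[str]) -> int:
    """Map validation results to an HTTP status; 422 is an assumed client-error code."""
    return 422 if errors else 200
```

In the real service, a Pydantic model configured to forbid extra fields would play the role of `validate_predict_payload`.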

Output schema

Responses include:

- prediction results,
- model version metadata,
- optional confidence scores.
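A client could sanity-check a response against this contract along the following lines. The sketch is based on the example response shown earlier; treating `model_run_id` as required, expecting probabilities to sum to 1, and expecting the predicted class to be the most probable one are assumptions, not documented guarantees.

```python
import math

REQUIRED_FIELDS = ("predicted_class", "model_version", "model_run_id")


def check_prediction_response(response: dict) -> None:
    """Raise ValueError if the response breaks the assumed output contract."""
    prediction = response["prediction"]
    for key in REQUIRED_FIELDS:
        if key not in prediction:
            raise ValueError(f"missing field: {key}")
    probabilities = prediction.get("probabilities")
    if probabilities is None:
        return  # confidence scores are optional
    if not math.isclose(sum(probabilities.values()), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities do not sum to 1")
    # Assumption: the predicted class is the most probable one.
    best = max(probabilities, key=probabilities.get)
    if str(prediction["predicted_class"]) != best:
        raise ValueError("predicted_class is not the most probable class")
```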

Versioning strategy

- API versioning via URL prefix or headers,
- backward-incompatible changes require a new version,
- model upgrades do not require API changes if contracts are preserved.
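One way the "URL prefix or headers" rule might be resolved is sketched below. The header name `X-API-Version`, the default version `v1`, the `/vN/` prefix shape, and the precedence of the prefix over the header are all assumptions made for illustration.

```python
import re

DEFAULT_VERSION = "v1"            # hypothetical default
VERSION_HEADER = "x-api-version"  # hypothetical header name, stored lowercase


def resolve_api_version(path: str, headers: dict) -> str:
    """Pick the API version from a /vN/ URL prefix, then a header, then the default."""
    match = re.match(r"^/(v\d+)(?:/|$)", path)
    if match:
        return match.group(1)
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return normalised.get(VERSION_HEADER, DEFAULT_VERSION)
```

With this scheme, `/v2/predict` resolves to `v2` regardless of headers, and an unprefixed `/predict` falls back to the header and then to the default.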