Inference API Contract

API principles

The inference API follows these principles:

  • explicit and versioned request/response schemas,
  • strict input validation,
  • clear separation between transport and inference logic.


Core endpoints

GET /healthcheck ✅ Implemented

Health and readiness probe.

Status: Operational

Characteristics:

  • Returns service status
  • Memory usage metrics
  • Worker ID information

Used by:

  • Kubernetes liveness/readiness checks
  • External monitoring systems

Example:

curl http://localhost:8000/healthcheck/
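
As a rough illustration of the characteristics listed above, the sketch below assembles a health payload using only the Python standard library. The helper `build_health_payload` and the field names (`status`, `worker_id`, `memory_peak_rss`) are assumptions made for this example; the actual response body of `/healthcheck` is not specified in this document.

```python
import os
import resource  # Unix-only standard-library module


def build_health_payload() -> dict:
    """Assemble a hypothetical health payload (field names are assumptions)."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "status": "ok",
        # In this sketch the worker is identified by its process ID.
        "worker_id": os.getpid(),
        # Peak resident set size: kilobytes on Linux, bytes on macOS.
        "memory_peak_rss": usage.ru_maxrss,
    }
```

Using the process ID as the worker ID is only one possible choice; a deployment with named workers would likely report that name instead.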


POST /predict ✅ Implemented

Synchronous inference endpoint.

Status: Implemented. Accepts a feature dict and returns 1×2 probabilities from an MLflow-tracked model.

Characteristics:

  • validated input schema (Pydantic)
  • returns predicted class + per-class probabilities
  • MLflow run_id included for full traceability

Example:

curl -X POST http://localhost:8000/predict/ \
  -H "Content-Type: application/json" \
  -d '{"match_id": 99, "features": {"diff_win_5_mean": 0.3, "sex": 0}}'

Example response:

{
  "match_id": 99,
  "prediction": {
    "predicted_class": 0,
    "probabilities": {"0": 0.58, "1": 0.27, "2": 0.15},
    "model_version": "Staging",
    "model_run_id": "3f7a1c9d2e4b"
  }
}
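
A client may want to sanity-check a response before using it. The sketch below parses the example response above and checks two properties that hold for that example: the probabilities sum to 1, and `predicted_class` is the most probable class. Treating these as guarantees of the API is an assumption, and `check_prediction` is a hypothetical helper, not part of the service.

```python
import json
import math

# Example response body copied from the documentation above.
RAW_RESPONSE = """
{
  "match_id": 99,
  "prediction": {
    "predicted_class": 0,
    "probabilities": {"0": 0.58, "1": 0.27, "2": 0.15},
    "model_version": "Staging",
    "model_run_id": "3f7a1c9d2e4b"
  }
}
"""


def check_prediction(body: dict) -> int:
    """Return the predicted class after basic consistency checks."""
    prediction = body["prediction"]
    probabilities = prediction["probabilities"]

    # Per-class probabilities should sum to 1 (within rounding tolerance).
    total = sum(probabilities.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"probabilities sum to {total}, expected 1.0")

    # JSON object keys are strings, so convert the argmax key to int.
    most_likely = int(max(probabilities, key=probabilities.get))
    if most_likely != prediction["predicted_class"]:
        raise ValueError("predicted_class is not the most probable class")
    return most_likely


response = json.loads(RAW_RESPONSE)
print(check_prediction(response))
```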

Typical use cases:

  • interactive UI requests
  • small batch predictions


POST /predict/async 📋 Planned

Asynchronous inference submission.

Status: Celery infrastructure ready, not integrated

Planned Characteristics:

  • returns a job ID
  • execution handled by background workers
  • result retrieved via polling or callback


GET /metrics 📋 Planned

Prometheus-compatible metrics endpoint.

Status: Designed but not implemented
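
To make "Prometheus-compatible" concrete, the sketch below renders a single counter in the Prometheus text exposition format using plain Python. The metric name `inference_requests_total` and its labels are hypothetical, and label values are not escaped here; an actual implementation would more likely rely on a Prometheus client library than on hand-written formatting.

```python
def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric name and labels, for illustration only.
print(render_counter(
    "inference_requests_total",
    "Total number of inference requests.",
    {(("endpoint", "/predict"), ("status", "200")): 12},
))
```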


Input schema

  • defined via Pydantic models,
  • aligned with MLflow model signature,
  • rejects unknown or malformed fields.

Input validation failures are treated as client errors (HTTP 4xx).
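
A minimal sketch of such a schema follows, assuming Pydantic v2. The model name `PredictRequest`, the `features: dict[str, float]` typing and the helper `validate_payload` are assumptions based on the `/predict` example payload, not the service's actual definitions, and alignment with the MLflow model signature is not shown.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class PredictRequest(BaseModel):
    """Hypothetical request model mirroring the /predict example payload."""

    # extra="forbid" makes Pydantic reject fields that are not declared.
    model_config = ConfigDict(extra="forbid")

    match_id: int
    features: dict[str, float]


def validate_payload(payload: dict) -> tuple[bool, list[str]]:
    """Return (is_valid, error_locations) for a raw request body."""
    try:
        PredictRequest(**payload)
    except ValidationError as exc:
        # Each error carries the path ("loc") of the offending field.
        return False, [".".join(str(p) for p in err["loc"]) for err in exc.errors()]
    return True, []
```

With this model, a payload carrying an undeclared top-level field or a non-numeric `match_id` is rejected before any inference code runs.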


Output schema

Responses include:

  • prediction results,
  • model version metadata,
  • optional confidence scores.


Versioning strategy

  • API versioning via URL prefix or headers,
  • backward-incompatible changes require a new version,
  • model upgrades do not require API changes if contracts are preserved.
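
One way the two versioning options above could coexist is sketched below: the version is read from a URL prefix such as `/v2/predict` when present, and from a request header otherwise. The header name `X-API-Version`, the default `v1`, the precedence rule and the helper `resolve_version` are all assumptions for illustration; the document does not fix any of them.

```python
import re

DEFAULT_VERSION = "v1"            # assumed default, not from the source
VERSION_HEADER = "x-api-version"  # hypothetical header name
_PREFIX = re.compile(r"^/(v\d+)(/.*)$")


def resolve_version(path: str, headers: dict) -> tuple[str, str]:
    """Return (version, unprefixed_path) for an incoming request."""
    match = _PREFIX.match(path)
    if match:
        # The URL prefix wins when both a prefix and a header are present.
        return match.group(1), match.group(2)
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get(VERSION_HEADER, DEFAULT_VERSION), path
```

For example, `resolve_version("/v2/predict", {})` yields `("v2", "/predict")`, while an unprefixed `/predict` request falls back to the header or the default.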