SoccerPredictAI is a layered ML system with hybrid inference — a production-style MLOps
platform connecting scheduled data acquisition, versioned datasets, reproducible training,
experiment tracking, and sync/async serving through explicit contracts at every boundary.
**Layered end-to-end MLOps system with hybrid sync + async inference**

| Aspect | Detail |
| --- | --- |
| Offline / online separation | DVC pipeline (offline) and FastAPI + Celery (online) are independent execution environments sharing contracts and feature logic, not runtime infrastructure |
| Orchestration model | Calendar-driven ingestion (Airflow) + artifact-driven ML pipeline (DVC); each tool is used for its native purpose |
| Deployment model | Self-hosted single-node Kubernetes; stateless services with stateful storage managed by K8s |
| Contract discipline | Every subsystem boundary has a formal, tested contract: Great Expectations (data), MLflow model signature (model), Pydantic (API) |
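The artifact-driven half of the orchestration model can be illustrated with a DVC stage definition: DVC reruns a stage only when the content hash of a declared dependency changes, independent of any calendar schedule. A hypothetical `dvc.yaml` fragment (stage and file names are illustrative, not the project's actual pipeline):

```yaml
stages:
  train:
    cmd: python train.py --out models/model.pkl
    deps:
      - data/features.parquet   # produced upstream by the ingestion layer
      - train.py
    outs:
      - models/model.pkl        # versioned artifact consumed by serving
```

Running `dvc repro` after ingestion writes a new `features.parquet` triggers retraining; if no dependency changed, the stage is skipped.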
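The API side of that contract discipline can be sketched with Pydantic models. The request and response shapes below are assumptions for illustration, not the actual SoccerPredictAI schema:

```python
# Hypothetical sketch of a Pydantic API contract; field names and
# shapes are illustrative, not the project's real schema.
from pydantic import BaseModel, Field


class PredictionRequest(BaseModel):
    home_team: str
    away_team: str
    match_date: str  # ISO-8601 date string, e.g. "2025-03-01"


class PredictionResponse(BaseModel):
    # Class probabilities; each must lie in [0, 1] or validation fails,
    # so a malformed model output is rejected at the API boundary.
    home_win: float = Field(ge=0.0, le=1.0)
    draw: float = Field(ge=0.0, le=1.0)
    away_win: float = Field(ge=0.0, le=1.0)
```

FastAPI validates incoming payloads against the request model (rejecting violations with a 422) and serializes outputs through the response model, so the contract is enforced on every call rather than documented by convention.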
These are current limitations of the system as deployed. They are documented explicitly to avoid
presenting a planned or aspirational design as the current runtime reality.
| Limitation | Detail |
| --- | --- |
| Single-node Kubernetes | All services run on one VPS (healserver). A node failure is a full-service outage; no pod rescheduling across nodes is possible. |
| No high availability | No replicated control plane and no multi-node worker pool. This is an accepted tradeoff against infrastructure cost and operational overhead. |
| Single RabbitMQ broker | The message queue has no replication or clustering. RabbitMQ unavailability blocks all inference (sync and async). |
| No autoscaling | No Horizontal Pod Autoscaler is configured and Celery worker replicas are static, so the system cannot scale under unexpected load. |
| Manual model promotion | Models are promoted to the champion alias manually after review. No automated promotion policy or evaluation gate exists today. |
| No API authentication | All public endpoints are unauthenticated. Access control is limited to network-level TLS termination and operator-managed exposure. |
| Cache invalidation not tied to model promotion | When a new model is promoted to champion, existing Redis cache entries are not flushed, so stale predictions from the previous model may be served until the TTL expires. |
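A common mitigation for the cache-invalidation limitation is to embed the champion model's version in the cache key, so promotion makes entries written under the old version unreachable and they age out via TTL instead of being served. A minimal sketch of such a versioned key scheme, as an assumption rather than the system's current implementation:

```python
import hashlib


def cache_key(model_version: str, payload: dict) -> str:
    """Build a cache key that embeds the serving model's version.

    Hypothetical sketch: because the version is part of the key, promoting
    a new champion makes every entry written under the old version
    unreachable, so stale predictions expire via TTL rather than being
    returned against the new model.
    """
    # Canonicalize the payload so logically equal requests hash identically
    # regardless of dict insertion order.
    digest = hashlib.sha256(repr(sorted(payload.items())).encode()).hexdigest()[:16]
    return f"pred:{model_version}:{digest}"
```

The tradeoff is a brief cold-cache period after each promotion, which is usually preferable to silently serving the previous model's predictions.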