System Context (C4, Level 1)
This diagram defines the system boundary, external actors, and integrations. For the detailed boundary analysis, see System Boundary. For the physical deployment topology, see Deployment View.
Context Diagram
Actors and Roles
End User / Viewer
Consumes match outcome predictions via the Streamlit web interface hosted on an independent external VPS (time2bet.ru).
Has no direct access to internal cluster services.
System Operator
Deploys, monitors, and maintains the system.
Has SSH access to healserver and access to GitLab CI protected variables, and holds the age private key.
The only human actor with direct cluster access.
WhoScored.com
Third-party source of football match statistics. Treated as an untrusted external input — all data is validated via Great Expectations before use. Subject to layout changes, rate limiting, and availability issues outside operator control.
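As a rough illustration of treating scraped data as untrusted input, the sketch below checks a single scraped match record before it is accepted. It uses plain Python as a stand-in rather than Great Expectations, and the field names (`home_team`, `away_team`, `home_goals`, `away_goals`) and the goal bound are assumptions made for the example, not the project's actual schema or expectation suite.

```python
from typing import Any

# Hypothetical field names and bounds; the real expectation suite is not shown here.
REQUIRED_FIELDS = {"home_team": str, "away_team": str, "home_goals": int, "away_goals": int}
MAX_PLAUSIBLE_GOALS = 30


def validate_match_record(record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems: list[str] = []

    # Structural checks first: every field must exist and have the expected type.
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type) or isinstance(record[field], bool):
            # bool is a subclass of int in Python, so it is rejected explicitly.
            problems.append(f"wrong type for {field}: {type(record[field]).__name__}")
    if problems:
        return problems

    # Value checks only run on structurally sound records.
    for field in ("home_goals", "away_goals"):
        if not 0 <= record[field] <= MAX_PLAUSIBLE_GOALS:
            problems.append(f"{field} out of range: {record[field]}")
    for field in ("home_team", "away_team"):
        if not record[field].strip():
            problems.append(f"{field} is empty")
    if record["home_team"] == record["away_team"]:
        problems.append("home_team and away_team are identical")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that changed when the source layout shifts.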
Selenoid Server
Dedicated external host running a Selenoid browser grid.
Invoked by celery-worker-api to perform headless browser scraping against WhoScored.
Operator-managed, but runs outside the Kubernetes cluster — a separate operational boundary.
Time2Bet Web UI (Streamlit)
User-facing prediction frontend hosted on an external VPS. Calls the inference API over public HTTPS. Outside the system boundary; dependent on API availability.
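Because the UI depends on API availability, a client on the VPS side would plausibly call the API with a timeout and degrade gracefully when it is unreachable. The sketch below shows one way to do that with the standard library. The `/predict` path, the payload shape, and the `fetch_prediction` helper are hypothetical, since the actual API routes are not described in this document.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional

# Hypothetical endpoint path; the real API routes are not given in this document.
PREDICT_PATH = "/predict"


def fetch_prediction(
    base_url: str,
    payload: dict,
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[dict]:
    """POST the payload to the inference API; return None if the API is unavailable."""
    if not base_url.startswith("https://"):
        raise ValueError("the inference API is expected to be reached over HTTPS")

    request = urllib.request.Request(
        base_url.rstrip("/") + PREDICT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, json.JSONDecodeError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        # The UI degrades to "no prediction" instead of crashing.
        return None
```

The `opener` parameter exists only so the function can be exercised without network access; a caller would normally leave it at its default.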
DVC Pipeline (Offline Execution Context)
The ML training pipeline. Runs outside the K8s cluster — locally or in CI — against MinIO (data) and MLflow (model artifacts). Not a runtime component; produces the versioned model artifacts that the serving layer consumes.
GitLab CI/CD
Manages build, test, and deployment pipelines. Part of the delivery boundary: pushes Helm deployments to the cluster and handles secret decryption during the deploy phase. Does not participate in the runtime execution path.
System Responsibilities
SoccerPredictAI is responsible for:
- scraping and ingesting match data from WhoScored via Selenoid,
- normalizing and storing structured data in PostgreSQL,
- exporting versioned datasets to MinIO via DVC,
- training match outcome prediction models reproducibly,
- tracking experiments and managing model lifecycle in MLflow,
- serving predictions synchronously and asynchronously via FastAPI + Celery,
- exposing service health and Prometheus metrics for observability.
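The difference between synchronous and asynchronous serving can be sketched without the real stack. In the example below, a thread pool stands in for the Celery workers and plain methods stand in for the FastAPI endpoints. The `PredictionService` class, the `predict` placeholder, and its fixed probabilities are assumptions for illustration, and the state names are only loosely modeled on task states, not taken from the project's code.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


def predict(features: dict) -> dict:
    """Placeholder model (hypothetical): returns fixed outcome probabilities."""
    return {"home_win": 0.4, "draw": 0.3, "away_win": 0.3}


class PredictionService:
    """Thread-pool stand-in for an API that offers sync and async prediction."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._tasks: dict[str, Future] = {}

    def predict_sync(self, features: dict) -> dict:
        # Synchronous path: the caller blocks until the result is ready.
        return predict(features)

    def submit(self, features: dict) -> str:
        # Asynchronous path: hand the work to a worker and return a task id at once.
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = self._pool.submit(predict, features)
        return task_id

    def status(self, task_id: str) -> dict:
        # The caller polls with the task id until the state is terminal.
        future = self._tasks.get(task_id)
        if future is None:
            return {"state": "UNKNOWN"}
        if not future.done():
            return {"state": "PENDING"}
        error = future.exception()
        if error is not None:
            return {"state": "FAILURE", "error": str(error)}
        return {"state": "SUCCESS", "result": future.result()}
```

A caller that needs an answer immediately uses `predict_sync`; one that can wait calls `submit` and then polls `status` with the returned id.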
Non-Goals
- The system does not guarantee betting profitability.
- It is not a general sports analytics platform.
- It does not support multiple data providers or multiple sports.
- It does not provide user authentication or multi-tenant access control.
Related
- System Boundary — detailed inside/outside analysis and trust zones
- Container View — services inside the system boundary
- Deployment View — physical topology and ingress path
- Security — trust boundaries and secret lifecycle