# Deployment View

This page describes where the system runs, how components are physically distributed, and how traffic flows from the internet to individual services.


## Physical Topology

```mermaid
flowchart TB
  subgraph Internet[Public Internet]
    User[End User]
  end
  subgraph ExtVPS[External VPS — time2bet.ru]
    StreamlitUI[Streamlit Web UI]
  end
  subgraph ExtSelenoid[External Host — Selenoid]
    SelenoidGrid[Selenoid Browser Grid]
  end
  subgraph HealServer[VPS — healserver — single node]
    HostNginx[Host-level Nginx\nTLS termination — port 443]
    subgraph K8s[Kubernetes — single-node cluster]
      subgraph NS_Ingress[namespace: ingress-nginx]
        Ingress[Nginx Ingress Controller\nNodePort 31390]
      end
      subgraph NS_DS[namespace: ds]
        Airflow[Airflow\nScheduler + Workers]
        PG[PostgreSQL]
        MinIO[MinIO S3]
        MLflow[MLflow\nTracking + Registry]
        Prom[Prometheus]
        Graf[Grafana\n📋 dashboards planned]
      end
      subgraph NS_Soccer[namespace: soccer-api]
        API[FastAPI\nInference Service]
        MQ[RabbitMQ]
        WorkerAPI[Celery worker-api]
        WorkerML[Celery worker-ml]
        Redis[Redis Cache]
      end
      subgraph NS_Mon[namespace: monitoring]
        KSM[kube-state-metrics]
        NE[node-exporter]
      end
    end
  end
  User -->|HTTPS| ExtVPS
  ExtVPS -->|HTTPS /predict| HostNginx
  User -->|HTTPS /predict direct| HostNginx
  HostNginx -->|NodePort 31390| Ingress
  Ingress -->|/predict, /healthcheck, /metrics| API
  API -->|enqueue task| MQ
  MQ --> WorkerAPI
  MQ --> WorkerML
  WorkerAPI -->|browser session| SelenoidGrid
  WorkerAPI --> PG
  WorkerAPI --> Redis
  WorkerML --> Redis
  PG --> MinIO
  MinIO -.->|dvc pull| MLpipeline[Offline ML Pipeline\nCI / local]
  MLpipeline --> MLflow
  API -->|model_uri| MLflow
  KSM --> Prom
  NE --> Prom
  API --> Prom
  WorkerAPI --> Prom
  WorkerML --> Prom
  Prom --> Graf
```

## Namespace Layout

| Namespace | Services | Purpose |
|---|---|---|
| `ingress-nginx` | Nginx Ingress Controller | Routes inbound traffic to cluster services by hostname/path |
| `ds` | Airflow, PostgreSQL, MinIO, MLflow, Prometheus, Grafana | Data platform and ML infrastructure |
| `soccer-api` | FastAPI, RabbitMQ, Celery worker-api, Celery worker-ml, Redis | Inference service and async task infrastructure |
| `monitoring` | kube-state-metrics, node-exporter | K8s cluster and host-level metrics |
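
Expressed as manifests, this layout is just four plain Namespace objects. A minimal sketch for two of them (the `part-of` label is an illustrative addition, not taken from the charts):

```yaml
# Two of the four namespaces from the table above;
# the label is an illustrative addition.
apiVersion: v1
kind: Namespace
metadata:
  name: soccer-api
  labels:
    app.kubernetes.io/part-of: soccer-platform
---
apiVersion: v1
kind: Namespace
metadata:
  name: ds
```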

## Ingress Path

Traffic from the public internet follows this path:

```text
Internet
  → host-level Nginx (port 443, TLS termination, VPS)
    → K8s NodePort 31390
      → Nginx Ingress Controller (namespace: ingress-nginx)
        → FastAPI service (namespace: soccer-api)
```

Key notes:

- TLS is terminated at the host-level Nginx, which acts as a reverse proxy to the K8s NodePort.
- The Ingress Controller routes requests to services by hostname and path prefix.
- No service in the `ds` or `monitoring` namespaces is publicly exposed; they are reachable only from inside the cluster.
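
For illustration, the path-based routing at the Ingress layer could be expressed roughly as follows. This is a minimal sketch: the hostname, service name, and port are assumptions, not values from the actual charts.

```yaml
# Hypothetical Ingress for the inference service; the hostname,
# service name, and port number are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: soccer-api
  namespace: soccer-api
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /predict
            pathType: Prefix
            backend:
              service:
                name: soccer-api
                port:
                  number: 8000
          - path: /healthcheck
            pathType: Prefix
            backend:
              service:
                name: soccer-api
                port:
                  number: 8000
```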


## External Services

| Service | Host | Role | K8s integration |
|---|---|---|---|
| Selenoid Browser Grid | Dedicated external host | Headless Chrome sessions for WhoScored scraping | Called by `celery-worker-api` over HTTP; not inside the K8s cluster |
| Streamlit Web UI | External VPS (time2bet.ru) | User-facing prediction interface | Calls FastAPI over public HTTPS; no direct cluster access |
| GitLab CI/CD | GitLab.com SaaS | Build, test, and deploy pipeline | Pushes Helm charts and secrets to healserver via SSH |
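
The deploy stage of the GitLab pipeline might look roughly like the job below, following the SSH-based flow in the table. This is a sketch only: the SSH user, remote paths, and variable names are assumptions.

```yaml
# Hypothetical GitLab CI deploy job; user, host paths, and
# variable names are illustrative, not taken from the repo.
deploy:
  stage: deploy
  script:
    # Ship charts and encrypted values files to the VPS over SSH
    - rsync -az k8s/helm/ deploy@healserver:/opt/helm/
    # Decrypt values on the host and upgrade the release;
    # the age private key comes from a protected CI variable
    - ssh deploy@healserver "SOPS_AGE_KEY='$AGE_PRIVATE_KEY' sops -d /opt/helm/soccer-api/values-prod.enc.yaml > /tmp/values.yaml && helm upgrade --install soccer-api /opt/helm/soccer-api -n soccer-api -f /tmp/values.yaml"
  only:
    - main
```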

## Helm Chart Structure

All Kubernetes resources are managed via Helm charts in `k8s/helm/`.

```text
k8s/helm/
  soccer-api/        — FastAPI + Celery + RabbitMQ + Redis
  airflow/           — Airflow deployment (custom values)
  monitoring/        — Prometheus + Grafana + exporters
  ...
```

Secrets are provided as SOPS-encrypted Helm values files (`values-*.enc.yaml`). CI decrypts them at deploy time using the age private key from a protected CI variable.
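
A matching `.sops.yaml` creation rule could look like this sketch; the age recipient shown is a placeholder, not the real public key:

```yaml
# Hypothetical .sops.yaml; the age recipient below is a
# placeholder, not a real key.
creation_rules:
  - path_regex: k8s/helm/.*/values-.*\.enc\.yaml$
    age: age1qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
```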


## Deployment Constraints

These constraints are architectural facts, not temporary limitations:

| Constraint | Architectural consequence |
|---|---|
| Single-node Kubernetes | No pod rescheduling across nodes; node failure is a full-service outage. Designed for portfolio/demo scale. |
| No High Availability | No replicated control plane; no multi-node worker pool. Accepted tradeoff against infrastructure cost. |
| Self-hosted VPS | Full operational responsibility: K8s upgrades, disk management, TLS renewal, backup. |
| External Selenoid host | Browser automation is outside the cluster network boundary; an independent failure domain not covered by K8s health probes. |
| Single RabbitMQ broker | Message queue is a single point of failure for the inference path. Acceptable at current throughput. |

These constraints are documented explicitly because they affect reasoning about failure modes, scaling, and future migration.


## Known Limitations

| Limitation | Impact | Mitigation |
|---|---|---|
| Single-node K8s cluster | No HA; node failure = full outage | Manual recovery via runbook; acceptable for portfolio scope |
| No cluster autoscaling | Cannot scale under load | Workload is light; manual scaling if needed |
| Selenoid runs outside K8s | Separate ops boundary; no K8s health probes | Monitored externally; scraping failures surface via Airflow |
| Single RabbitMQ broker | No message-queue HA | Acceptable at current throughput; documented as a known limit |
| No automated certificate renewal (if Let's Encrypt is not configured) | TLS certificate expiry | Operator runbook; or Let's Encrypt with certbot/cert-manager |
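
If the cert-manager route from the last row were taken, the issuer is a single small manifest along these lines (a sketch; the issuer name and email are placeholders). Note that since TLS terminates at the host-level Nginx rather than inside the cluster, certbot on the host is the more natural fit for this topology.

```yaml
# Hypothetical cert-manager ClusterIssuer for the Let's Encrypt
# mitigation; the name and email address are placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```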

## Portability Note

The Helm charts are parameterized with no hardcoded values specific to healserver. Migration to a managed Kubernetes cluster (GKE, EKS, AKS) requires:

  1. Update DNS and TLS entries in chart values.
  2. Replace MinIO with cloud object storage (update DVC remote config).
  3. Replace self-managed PostgreSQL with a managed instance if desired.
  4. Re-encrypt SOPS secrets with an updated age key.

No code changes are required.
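
In practice, steps 1–3 would reduce to a per-environment values override, roughly like the sketch below. The key names are illustrative, not the actual chart schema:

```yaml
# Hypothetical values-gke.yaml override; key names are
# illustrative, not the actual chart schema.
ingress:
  host: api.new-domain.example
  tls:
    secretName: api-tls
s3:
  endpoint: https://storage.googleapis.com   # replaces in-cluster MinIO
  bucket: soccer-artifacts
postgres:
  host: 10.0.0.5                             # managed instance, e.g. Cloud SQL
  existingSecret: pg-credentials
```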