Chapter 58
AI Infrastructure
🏗️ The full stack behind running AI
AI infrastructure = compute + storage + network + orchestration + MLOps tooling. Without good infrastructure, even the best model fails in production.
Compute Layer
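- GPU: NVIDIA A100/H100/B200 for training; L4/T4/L40 for inference.
- TPU: Google's accelerator; works especially well with JAX and TensorFlow.
- Inferentia/Trainium: AWS custom silicon (Inferentia for inference, Trainium for training).
- CPU: lightweight models and embeddings; ARM-based AWS Graviton is cost-effective.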
Storage Layer
- Object store: S3, GCS; holds datasets and model artifacts.
- Parallel FS: Lustre, WekaIO; high throughput for training.
- Local NVMe: hot cache for the dataloader.
- Vector DB: Pinecone, Weaviate, Qdrant, pgvector.
- Feature Store: Feast, Tecton.
- Warehouse: BigQuery, Snowflake, Databricks.
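To make the vector DB entry concrete, the sketch below shows the core operation such a store performs: given a query embedding, return the closest stored embeddings. This is a brute-force illustration in plain Python, not how any of the listed products is implemented; production systems typically rely on approximate nearest-neighbor indexes. The names `cosine_similarity`, `top_k`, and the toy three-dimensional vectors are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: document id -> embedding (hypothetical 3-dimensional vectors).
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

Here `results` ranks `doc-a` first and `doc-b` second. The linear scan costs one comparison per stored vector, which is why a dedicated index becomes worthwhile as the collection grows.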
Network
- NVLink/NVSwitch: intra-node GPU-to-GPU links.
- InfiniBand HDR/NDR: inter-node links, 200 Gbps (HDR) to 400 Gbps (NDR).
- CDN: distributes model artifacts and static assets.
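A quick back-of-the-envelope calculation shows why inter-node bandwidth matters. The sketch below estimates the ideal time to move a payload over a link at line rate. It ignores protocol overhead, latency, and contention, so real transfers take longer. The function name `transfer_seconds` and the 140 GB payload (roughly a 70-billion-parameter model at 2 bytes per parameter) are assumptions chosen for illustration.

```python
def transfer_seconds(size_gb, link_gbps):
    """Ideal transfer time: payload bits divided by link bits per second."""
    bits = size_gb * 1e9 * 8          # gigabytes -> bits
    return bits / (link_gbps * 1e9)   # gigabits/s -> bits/s


# Hypothetical payload: 70e9 parameters * 2 bytes = 140 GB.
payload_gb = 140
for gbps in (200, 400):
    print(f"{gbps} Gbps: {transfer_seconds(payload_gb, gbps):.1f} s")
```

At these assumed numbers the transfer takes about 5.6 s at 200 Gbps and 2.8 s at 400 Gbps, so doubling the link speed halves the ideal transfer time.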
Orchestration — Kubernetes
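# Inference pod (GPU)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm                      # placeholder name
spec:
  selector:
    matchLabels: { app: llm }    # required by apps/v1; must match the template labels
  template:
    metadata:
      labels: { app: llm }       # placeholder label
    spec:
      containers:
        - name: llm
          image: ghcr.io/org/llm:1.4
          resources:
            limits: { nvidia.com/gpu: 1 }   # one whole GPU for this container
          readinessProbe:
            httpGet: { path: /health, port: 8000 }
Key Add-ons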
- KServe / Seldon: model serving on K8s.
- Kubeflow: ML workflows and training operators.
- Argo Workflows: pipeline DAGs.
- Karpenter: node autoscaler that provisions right-sized nodes on demand.
- NVIDIA GPU Operator: manages drivers, MIG, and DCGM monitoring.
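The readinessProbe in the Deployment above expects the container to answer `GET /health` on port 8000. Below is a minimal sketch of such an endpoint using only the Python standard library. A real service would more likely expose it from its web framework. The `model_ready` flag, the `HealthHandler` class, and `make_server` are hypothetical names for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical flag: set to False while the model is still loading.
    model_ready = True

    def do_GET(self):
        if self.path == "/health":
            # Kubernetes treats 2xx/3xx as ready; 503 keeps the pod out of rotation.
            status = 200 if self.model_ready else 503
            body = json.dumps({"ready": self.model_ready}).encode()
        else:
            status, body = 404, b'{"error": "not found"}'
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(host="0.0.0.0", port=8000):
    """Build (but do not start) the HTTP server for the probe."""
    return HTTPServer((host, port), HealthHandler)


# Inside the container one would run:
#   make_server().serve_forever()
```

Returning 503 until the model has loaded keeps traffic away from a pod that cannot serve yet, which is the purpose of a readiness probe.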
MLOps Tooling
- Experiment: MLflow, Weights & Biases, Neptune.
- Data versioning: DVC, LakeFS.
- Pipeline: Airflow, Prefect, Dagster.
- CI/CD: GitHub Actions, GitLab CI, Buildkite.
- Secrets: Vault, AWS Secrets Manager.
Build vs Buy
- Buy (Managed): SageMaker, Vertex AI, Modal, Replicate, Anyscale; fast to start, but costly.
- Build (Self-host): K8s + GPU nodes + open-source stack; cheap at scale, but ops-heavy.
- Hybrid: self-host training (cheap), use managed serving (uptime).
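The "cheap at scale" claim can be framed as a break-even calculation: self-hosting adds a fixed monthly operations cost but lowers the per-GPU-hour rate. The sketch below is one simple way to model that. All prices are made-up placeholders, not quotes from any provider, and the function names are assumptions for the example.

```python
def monthly_cost_managed(gpu_hours, rate):
    """Managed platform: pay per GPU-hour only."""
    return gpu_hours * rate


def monthly_cost_self_hosted(gpu_hours, rate, fixed_ops):
    """Self-hosted: fixed ops cost (staff, tooling) plus a lower hourly rate."""
    return fixed_ops + gpu_hours * rate


def break_even_gpu_hours(managed_rate, self_rate, fixed_ops):
    """GPU-hours per month above which self-hosting is cheaper."""
    if managed_rate <= self_rate:
        return None  # self-hosting never catches up
    return fixed_ops / (managed_rate - self_rate)


# Placeholder numbers for illustration only.
hours = break_even_gpu_hours(managed_rate=4.0, self_rate=2.0, fixed_ops=6000.0)
print(f"Break-even at {hours:.0f} GPU-hours per month")
```

With these placeholder figures the break-even point is 3000 GPU-hours per month. Below that, the managed option costs less; above it, self-hosting does.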
Security & Governance
- VPC isolation, private endpoints.
- IAM least-privilege.
- PII redaction in logs.
- Model cards + data lineage.
- Compliance: EU AI Act, SOC 2, HIPAA.
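As one illustration of PII redaction in logs, the sketch below attaches a filter to a Python logging handler so that email addresses are masked before a record is written. It covers emails only and uses a deliberately simple regular expression. A real deployment would need broader patterns (phone numbers, IDs, names). The `RedactEmailFilter` class and the `inference` logger name are assumptions for this example.

```python
import logging
import re

# Simple email pattern; an assumption for this sketch, not an exhaustive matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in the formatted message with a placeholder."""

    def filter(self, record):
        message = record.getMessage()  # merge msg and args first
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", message)
        record.args = ()               # args are already merged into msg
        return True                    # keep the (now redacted) record


handler = logging.StreamHandler()
handler.addFilter(RedactEmailFilter())
logger = logging.getLogger("inference")
logger.addHandler(handler)
logger.warning("request from %s failed", "alice@example.com")
```

Attaching the filter to the handler rather than the logger means records propagated from child loggers are redacted as well.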
💡 Start with a small stack
Day-1 stack: Postgres + S3 + 1 GPU VM + FastAPI + Docker + Grafana. As you scale, add Kafka, a vector DB, K8s, and Kubeflow.
Summary
✨ What we learned in this chapter
- Compute, storage, network, orchestration: the four pillars.
- K8s + GPU Operator + KServe: modern serving infrastructure.
- MLOps tools: from experiment through to production.
- Build vs Buy: a context-dependent trade-off.