Chapter 49

Model Deployment

🚀 From Notebook to Production

Training the model is done; now it has to reach real users. Deployment is not just an API: it also covers packaging, scaling, monitoring, and versioning.

Deployment Patterns

REST API: FastAPI/Flask, synchronous request/response.
Batch: scores generated on a schedule, e.g. once an hour (cron/Airflow).
Streaming: Kafka → real-time inference.
Edge: on-device inference on mobile/IoT devices (TFLite, CoreML).
Serverless: AWS Lambda, Modal, Replicate.
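
To make the batch pattern concrete (scores generated on a schedule by cron or Airflow rather than per request), here is a minimal sketch using only the Python standard library. The `score` function, the weights, and the column names `id`, `f1`, `f2` are hypothetical stand-ins for a real trained model and its input schema.

```python
import csv
from datetime import datetime, timezone
from typing import List, TextIO

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = [0.5, 2.0]


def score(features: List[float]) -> float:
    """Return a score for one row of features (assumed model)."""
    return sum(w * x for w, x in zip(WEIGHTS, features))


def run_batch(in_file: TextIO, out_file: TextIO, run_at: datetime) -> int:
    """Score every row of a CSV in one pass and write the results.

    Expects columns id, f1, f2 (assumed schema). Returns the row count.
    """
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, fieldnames=["id", "score", "scored_at"])
    writer.writeheader()
    count = 0
    for row in reader:
        features = [float(row["f1"]), float(row["f2"])]
        writer.writerow(
            {
                "id": row["id"],
                "score": f"{score(features):.4f}",
                "scored_at": run_at.isoformat(),
            }
        )
        count += 1
    return count


if __name__ == "__main__":
    # File names are placeholders; a scheduler would invoke this script.
    with open("input.csv", newline="") as src, open("scores.csv", "w", newline="") as dst:
        print(run_batch(src, dst, datetime.now(timezone.utc)), "rows scored")
```

Assuming the script is saved as `score_batch.py`, an hourly cron entry such as `0 * * * * python score_batch.py` would run it at the top of every hour. Unlike the REST pattern, nothing waits on an individual prediction, so throughput matters more than per-row latency.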

# Popular options
- AWS SageMaker / Bedrock
- GCP Vertex AI
- Azure ML
- HuggingFace Inference Endpoints
- Replicate, Modal, RunPod (GPU-first)

💡 Start Small

FastAPI + Docker + a single cloud VM is enough for roughly 80% of projects. Reach for Triton or vLLM only when scale or latency demands it.

✨ What We Learned in This Chapter