Chapter 56
রিয়েল-টাইম AI পাইপলাইন
Real-time AI Pipelines
⏱️ Data যত fresh, decision তত smart
Real-time pipeline — fraud detection, recommendation, anomaly alert, live transcription — সবই sub-second latency-তে চলে।
Architecture Blueprint
Source (App, IoT, Click)
│
▼
[Kafka / Kinesis / Pulsar] ← event bus
│
├─► [Stream Processor] (Flink, Spark Structured Streaming)
│ │
│ ▼
│ Feature Store (online) ← Feast, Tecton
│
├─► [Model Service] ←─ reads features
│
└─► [Sink]: DB, dashboard, alertEvent Streaming — Kafka
from confluent_kafka import Producer, Consumer
p = Producer({"bootstrap.servers": "localhost:9092"})
p.produce("clicks", key="u123", value='{"item":"a1"}')
p.flush()Stream Processing — Flink
- Windowing — tumbling, sliding, session।
- Exactly-once semantics।
- State backend (RocksDB)।
- Watermark দিয়ে late event handle।
Online Feature Store
Training-এ যা feature ব্যবহার হয়েছে, inference-এ সেগুলোই millisecond-এ পাওয়া যায় — Redis/DynamoDB-এ। Train/serve skew রোধ করে।
# Feast — feature retrieval at inference
features = store.get_online_features(
features=["user:avg_basket", "user:last_click_cat"],
entity_rows=[{"user_id": "u123"}]
).to_dict()Latency Budget উদাহরণ
Total SLA: 100 ms
Network in/out : 20 ms
Feature fetch : 15 ms
Model inference : 40 ms
Postprocess + log : 10 ms
Buffer : 15 msBackpressure & Reliability
- Consumer lag monitor (Burrow, Grafana)।
- Dead Letter Queue — corrupt event isolate।
- Idempotent consumer — duplicate-safe।
- Checkpoint + replay।
Real-time AI Use Cases
- Card fraud — <100 ms decision।
- Ride pricing & ETA।
- Live captioning, translation।
- Recommendation refresh on each click।
- Industrial anomaly detection।
💡 Online + Offline ঐক্য
একই feature definition online (Redis) এবং offline (warehouse) — দুই জায়গায়। Feature store-এর মূল কাজ এটাই।
সারসংক্ষেপ
✨ এই অধ্যায়ে যা শিখলাম
- Kafka + Flink + Feature Store — মূল triad।
- Latency budget বানিয়ে design।
- Backpressure, DLQ, idempotency — reliability।