Chapter 56

রিয়েল-টাইম AI পাইপলাইন

Real-time AI Pipelines

⏱️ Data যত fresh, decision তত smart

Real-time pipeline — fraud detection, recommendation, anomaly alert, live transcription — সবই sub-second latency-তে চলে।

Architecture Blueprint

Source (App, IoT, Click)
   │
   ▼
[Kafka / Kinesis / Pulsar]   ← event bus
   │
   ├─► [Stream Processor]    (Flink, Spark Structured Streaming)
   │       │
   │       ▼
   │   Feature Store (online)  ← Feast, Tecton
   │
   ├─► [Model Service]  ←─ reads features
   │
   └─► [Sink]: DB, dashboard, alert

Event Streaming — Kafka

from confluent_kafka import Producer, Consumer
p = Producer({"bootstrap.servers": "localhost:9092"})
p.produce("clicks", key="u123", value='{"item":"a1"}')
p.flush()

Stream Processing — Flink

Windowing — tumbling, sliding, session।
Exactly-once semantics।
State backend (RocksDB)।
Watermark দিয়ে late event handle।

Online Feature Store

Training-এ যা feature ব্যবহার হয়েছে, inference-এ সেগুলোই millisecond-এ পাওয়া যায় — Redis/DynamoDB-এ। Train/serve skew রোধ করে।

# Feast — feature retrieval at inference
features = store.get_online_features(
    features=["user:avg_basket", "user:last_click_cat"],
    entity_rows=[{"user_id": "u123"}]
).to_dict()

Latency Budget উদাহরণ

Total SLA: 100 ms
  Network in/out      :  20 ms
  Feature fetch       :  15 ms
  Model inference     :  40 ms
  Postprocess + log   :  10 ms
  Buffer              :  15 ms

Backpressure & Reliability

Consumer lag monitor (Burrow, Grafana)।
Dead Letter Queue — corrupt event isolate।
Idempotent consumer — duplicate-safe।
Checkpoint + replay।

Real-time AI Use Cases

Card fraud — <100 ms decision।
Ride pricing & ETA।
Live captioning, translation।
Recommendation refresh on each click।
Industrial anomaly detection।

💡 Online + Offline ঐক্য

একই feature definition online (Redis) এবং offline (warehouse) — দুই জায়গায়। Feature store-এর মূল কাজ এটাই।

সারসংক্ষেপ

✨ এই অধ্যায়ে যা শিখলাম

Kafka + Flink + Feature Store — মূল triad।
Latency budget বানিয়ে design।
Backpressure, DLQ, idempotency — reliability।

পূর্ববর্তী

স্কেলেবল AI সিস্টেম

পরবর্তী

ডিস্ট্রিবিউটেড ট্রেনিং