🟣 Terminology: Batch vs Streaming Processing
Batch: Process accumulated data at scheduled intervals.
- When: reports, model training, ETL jobs
- Tools: Spark, Hadoop, Airflow
- Latency: minutes to hours
Streaming: Process data continuously, in real time, as it arrives.
- When: fraud detection, live dashboards, recommendation updates
- Tools: Kafka, Flink, Spark Streaming
- Latency: milliseconds to seconds
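To make the contrast concrete, here is a minimal plain-Python sketch, not tied to any of the tools listed. The event shape `(account, amount)` and both function names are assumptions made up for illustration. The batch function runs once over data that has already accumulated and returns final totals; the streaming function updates its state and emits a fresh result for every event as it comes in.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (account id, amount).
Event = Tuple[str, float]


def batch_totals(events: Iterable[Event]) -> Dict[str, float]:
    """Batch style: consume the whole accumulated dataset, then return one result."""
    totals: Dict[str, float] = defaultdict(float)
    for account, amount in events:
        totals[account] += amount
    return dict(totals)


def streaming_totals(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Streaming style: keep running state and emit an update per incoming event."""
    totals: Dict[str, float] = defaultdict(float)
    for account, amount in events:
        totals[account] += amount
        # The updated total is available immediately, before later events arrive.
        yield account, totals[account]


if __name__ == "__main__":
    events = [("acct-1", 10.0), ("acct-1", 5.0), ("acct-2", 7.5)]

    # Batch: one answer after all the data has been read.
    print(batch_totals(events))

    # Streaming: one incremental answer per event.
    for update in streaming_totals(events):
        print(update)
```

Both functions compute the same totals. The difference is when results become available: once at the end of the run for batch, and after every event for streaming. That is the latency trade-off described above.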
Practice Questions
Q: You're building a dashboard that shows the CEO daily revenue summaries. Batch or streaming?