spark-user mailing list archives

From Russell Spitzer <russell.spit...@gmail.com>
Subject Re: Is "spark streaming" streaming or mini-batch?
Date Tue, 23 Aug 2016 16:34:57 GMT
Spark Streaming does not process one event at a time, which I think is what
people generally mean by "streaming." Instead, it processes groups of events.
Each group is a "micro-batch" whose events are all processed together.
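
As a rough illustration of the grouping idea (plain Python, not Spark code),
the sketch below buckets events into micro-batches by arrival time. The
function name, the event tuples, and the 1000 ms interval are all
hypothetical, chosen only for the example.

```python
from collections import defaultdict

def group_into_micro_batches(events, interval_ms):
    """Bucket (arrival_time_ms, payload) pairs into fixed-width time windows.

    Hypothetical helper for illustration only; it is not part of any Spark API.
    """
    batches = defaultdict(list)
    for arrival_ms, payload in events:
        # Events arriving in the same interval share a batch index.
        batches[arrival_ms // interval_ms].append(payload)
    # Return the batches in time order.
    return [batches[index] for index in sorted(batches)]

# Made-up events: (arrival time in ms, payload).
events = [(100, "a"), (400, "b"), (900, "c"), (1200, "d"), (1500, "e")]
batches = group_into_micro_batches(events, 1000)
print(batches)
```

With these made-up inputs, the first three events land in one batch and the
last two in the next; each batch would then be handed to the processing step
as a unit.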

Streaming theoretically always has better latency because each event is
processed as soon as it arrives. In micro-batching, by contrast, processing
of a batch cannot start until its last element has arrived, so the latency
of every event in the batch is bounded by that last arrival.
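
The latency point can be sketched with a toy model (plain Python, not Spark
code). It assumes processing itself takes zero time, that a batch is handed
off at the end of its interval, and that the arrival times and the 1000 ms
interval are made up; real systems add scheduling and processing delays on
top.

```python
def per_event_wait_ms(arrivals_ms):
    # Idealized one-at-a-time streaming: each event is handled on arrival.
    return [0 for _ in arrivals_ms]

def micro_batch_wait_ms(arrivals_ms, interval_ms):
    # Each event waits until the end of the interval it arrived in,
    # because the batch is only processed once the interval closes.
    waits = []
    for t in arrivals_ms:
        batch_end = (t // interval_ms + 1) * interval_ms
        waits.append(batch_end - t)
    return waits

# Hypothetical arrival times in milliseconds.
arrivals_ms = [100, 400, 900, 1200, 1500]
streaming_waits = per_event_wait_ms(arrivals_ms)
batch_waits = micro_batch_wait_ms(arrivals_ms, 1000)
print(streaming_waits)
print(batch_waits)
```

Under these assumptions, the earliest event in each batch waits the longest,
and no event can finish before its batch closes.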

Streaming theoretically has worse throughput because events cannot be
processed in bulk, so any fixed per-call overhead is paid for every event.
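
A simple cost model shows why bulk processing can help throughput. This is a
sketch under stated assumptions, not a measurement: it supposes every
processing call pays a fixed overhead (for example scheduling or a network
round trip) plus a per-event cost, and the numbers are invented.

```python
def total_cost(num_events, batch_size, overhead_per_call, cost_per_event):
    """Total cost of processing num_events in calls of at most batch_size.

    Hypothetical model: every call pays overhead_per_call once, and every
    event pays cost_per_event regardless of how it is grouped.
    """
    # Ceiling division: number of calls needed to cover all events.
    calls = -(-num_events // batch_size)
    return calls * overhead_per_call + num_events * cost_per_event

# Made-up units: 5 per call of overhead, 1 per event of work.
one_at_a_time = total_cost(1000, 1, 5, 1)
in_batches_of_100 = total_cost(1000, 100, 5, 1)
print(one_at_a_time, in_batches_of_100)
```

In this model the per-event work is the same either way; batching only
spreads the fixed overhead across more events, which is the throughput
advantage described above.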

In practice, throughput and latency are very implementation-dependent.

On Tue, Aug 23, 2016 at 8:41 AM Aseem Bansal <asmbansal2@gmail.com> wrote:

> I was reading this article https://www.inovex.de/blog/storm-in-a-teacup/
> and it mentioned that spark streaming actually mini-batch not actual
> streaming.
>
> I have not used streaming and I am not sure what is the difference in the
> 2 terms. Hence could not make a judgement myself.
>
