spark-user mailing list archives

From Gerard Maas <>
Subject Re: Data Loss - Spark streaming
Date Tue, 16 Dec 2014 12:12:43 GMT
Hi Jeniba,

The second part of this meetup recording has a very good answer to your
question. TD explains the current behavior and the ongoing work in Spark
Streaming to fix high availability (HA).

-kr, Gerard.

On Tue, Dec 16, 2014 at 11:32 AM, Jeniba Johnson <> wrote:
> Hi,
> I need a clarification. When running the streaming examples, suppose the
> batch interval is set to 5 minutes, and data has been collected from the
> input source (Flume) and processed for those 5 minutes.
> What happens to the data that keeps flowing from the input source into
> Spark Streaming? Is that data stored somewhere, or is it lost?
> Otherwise, what is the solution for capturing every record without any
> loss in Spark Streaming?
> Awaiting your kind reply.
> Regards,
> Jeniba Johnson
