spark-user mailing list archives

From lalit1303 <>
Subject Re: Spark streaming at-least once guarantee
Date Wed, 06 Aug 2014 07:13:57 GMT
Hi TD,

Thanks a lot for your reply :)
I am already looking into creating a new DStream for SQS messages. It would
be very helpful if you could provide some guidance on this.

The main motive for integrating SQS with Spark Streaming is to run my jobs
in high-availability mode.
As of now I have a downloader, which downloads the files pointed to by SQS
messages, and then my Spark Streaming job comes into action to process them. I
am planning to move the whole architecture to high availability (the Spark
Streaming job can easily be made highly available); the only piece left is
integrating SQS with Spark Streaming so that it can automatically recover from
master node failure. Also, I want to build a single pipeline, starting from
receiving the SQS message through to the processing of the corresponding file.

I couldn't think of any other approach to make my SQS downloader run in
high-availability mode. The only thing I need is a DStream that reads SQS
messages from the corresponding queue.
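For what it's worth, here is a rough sketch of how such a DStream might be built
on top of Spark Streaming's custom Receiver API. This is only an illustration,
not tested code: the class name `SQSReceiver` and the `queueUrl` parameter are
mine, and the SQS calls assume the AWS Java SDK v1 client.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver
import com.amazonaws.services.sqs.AmazonSQSClient
import com.amazonaws.services.sqs.model.ReceiveMessageRequest
import scala.collection.JavaConverters._

// Hypothetical custom receiver that long-polls an SQS queue and pushes
// each message body (e.g. the URL of the file to download) into Spark.
class SQSReceiver(queueUrl: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    // Poll on a background thread so onStart() returns immediately,
    // as the Receiver contract requires.
    new Thread("SQS Receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  def onStop(): Unit = {
    // Nothing to clean up; the polling loop checks isStopped().
  }

  private def receive(): Unit = {
    val sqs = new AmazonSQSClient() // uses the default credential chain
    while (!isStopped()) {
      val request = new ReceiveMessageRequest(queueUrl)
        .withMaxNumberOfMessages(10)
        .withWaitTimeSeconds(20) // long polling
      val messages = sqs.receiveMessage(request).getMessages.asScala
      messages.foreach { m =>
        store(m.getBody)
        sqs.deleteMessage(queueUrl, m.getReceiptHandle)
      }
    }
  }
}

// Usage: val sqsStream = ssc.receiverStream(new SQSReceiver(myQueueUrl))
```

Note that this sketch deletes each message as soon as `store()` returns, which
is the simple case; for an at-least-once guarantee you would instead delay the
`deleteMessage` call until the message has been reliably stored or processed,
relying on the SQS visibility timeout to redeliver on failure.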

Please let me know if there is any other workaround.

-- Lalit

Lalit Yadav
