spark-user mailing list archives

From Jun Yang <yangjun...@gmail.com>
Subject Re: Question about Spark Streaming Receiver Failure
Date Mon, 16 Mar 2015 07:33:33 GMT
Akhil,

I have checked the logs. There isn't any clue as to why the 5 receivers
failed.

That's why I assumed receiver failure is a common issue, and that we need to
figure out a way to detect this kind of failure and fail over.
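
For reference, the StreamingListener approach mentioned in the JIRA comment below can be sketched roughly like this (untested; the class and callback names follow the Spark 1.3 scheduler API, and the restart reaction is left as a comment since it depends on your deployment):

```scala
import org.apache.spark.streaming.scheduler.{
  StreamingListener,
  StreamingListenerReceiverError,
  StreamingListenerReceiverStopped
}

// Sketch of a monitoring hook: surfaces receiver failures so an external
// supervisor (or the driver itself) can react, e.g. by restarting the job.
class ReceiverFailureListener extends StreamingListener {

  override def onReceiverError(event: StreamingListenerReceiverError): Unit = {
    val info = event.receiverInfo
    // lastErrorMessage carries the most recent error reported by the receiver
    println(s"Receiver ${info.name} (stream ${info.streamId}) errored: " +
      info.lastErrorMessage)
  }

  override def onReceiverStopped(event: StreamingListenerReceiverStopped): Unit = {
    val info = event.receiverInfo
    println(s"Receiver ${info.name} stopped on ${info.location}")
    // One could raise an alert here, or stop the StreamingContext so that
    // an external process manager restarts the whole job.
  }
}

// Registration, assuming ssc is your StreamingContext:
// ssc.addStreamingListener(new ReceiverFailureListener)
```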

Thanks

On Mon, Mar 16, 2015 at 3:17 PM, Akhil Das <akhil@sigmoidanalytics.com>
wrote:

> You need to figure out why the receivers failed in the first place. Look
> in your worker logs and see what really happened. When you run a streaming
> job continuously for a long period, there will usually be a lot of logs
> (you can enable log rotation etc.), and if you are doing groupBy, join,
> etc. type operations, there will also be a lot of shuffle data. So you
> need to check the worker logs and see what happened (e.g. whether the disk
> filled up). We have streaming pipelines running for weeks without any issues.
>
> Thanks
> Best Regards
>
> On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang <yangjunpro@gmail.com> wrote:
>
>> Guys,
>>
>> We have a project which builds upon Spark streaming.
>>
>> We use Kafka as the input stream, and create 5 receivers.
>>
>> When this application had been running for around 90 hours, all 5
>> receivers failed for unknown reasons.
>>
>> In my understanding, Spark Streaming receivers are not guaranteed to
>> recover from failures automatically.
>>
>> So I just want to figure out a way to do fault recovery for receiver
>> failures.
>>
>> There is a JIRA comment that mentions using a StreamingListener to
>> monitor the status of receivers:
>>
>>
>> https://issues.apache.org/jira/browse/SPARK-2381?focusedCommentId=14056836&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14056836
>>
>> However, I haven't found any official documentation on how to do this.
>>
>> Has anyone run into the same issue and found a way to deal with it?
>>
>> Our environment:
>>    Spark 1.3.0
>>    Dual Master Configuration
>>    Kafka 0.8.2
>>
>> Thanks
>>
>> --
>> yangjunpro@gmail.com
>> http://hi.baidu.com/yjpro
>>
>
>
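
As a side note on the log-rotation suggestion above: one common way to cap worker log growth (a sketch only; the file paths and sizes here are example values to adjust for your deployment) is a `RollingFileAppender` in `conf/log4j.properties` on each worker:

```properties
# conf/log4j.properties on each worker (paths and sizes are examples)
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=/var/log/spark/worker.log
log4j.appender.rolling.MaxFileSize=100MB
log4j.appender.rolling.MaxBackupIndex=10
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```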


-- 
yangjunpro@gmail.com
http://hi.baidu.com/yjpro
