spark-user mailing list archives

From JF Chen <darou...@gmail.com>
Subject Re: "java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error
Date Wed, 06 Mar 2019 08:45:47 GMT
Hi
The max bytes setting should be large enough, because once the tasks fail,
the data is read from Kafka very fast, as normal.
The request.timeout.ms I set is 180 seconds.
I think the cause is a timeout setting or a max bandwidth setting, because
it recovers and reads the same partition very fast after the tasks are
marked failed.

Regards,
Junfeng Chen


On Wed, Mar 6, 2019 at 4:01 PM Akshay Bhardwaj <
akshay.bhardwaj1988@gmail.com> wrote:

> Sorry, the previous message was sent incomplete.
>
> To better debug the issue, please check the below config properties:
>
>    - In the Kafka consumer properties
>       - max.partition.fetch.bytes in the Spark Kafka consumer. If it is
>       not set for the consumer, check the global config at broker level.
>       - request.timeout.ms
>    - In Spark's configuration
>       - spark.streaming.kafka.consumer.poll.ms
>       - spark.network.timeout (if the above is not set, poll.ms
>       defaults to spark.network.timeout)
>
>
> Generally I have faced this issue when
> spark.streaming.kafka.consumer.poll.ms is less than request.timeout.ms
>
> Also, what is the average Kafka record size in bytes?
>
>
>
> Akshay Bhardwaj
> +91-97111-33849
>
>
> On Wed, Mar 6, 2019 at 1:26 PM Akshay Bhardwaj <
> akshay.bhardwaj1988@gmail.com> wrote:
>
>> Hi,
>>
>> To better debug the issue, please check the below config properties:
>>
>>    - max.partition.fetch.bytes in the Spark Kafka consumer. If it is not
>>    set for the consumer, check the global config at broker level.
>>    - spark.streaming.kafka.consumer.poll.ms
>>       - spark.network.timeout (if the above is not set, poll.ms
>>       defaults to spark.network.timeout)
>>    -
>>    -
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>>
>> On Wed, Mar 6, 2019 at 8:39 AM JF Chen <darouwan@gmail.com> wrote:
>>
>>> When my Kafka executor reads data from Kafka, it sometimes throws the
>>> error "java.lang.AssertionError: assertion failed: Failed to get records
>>> for **** after polling for 180000", which happens after 3 minutes of
>>> execution. The data waiting to be read is not that large, about 1 GB.
>>> Other partitions read by other tasks are very fast; the error always
>>> occurs on some specific executor.
>>>
>>> Regards,
>>> Junfeng Chen
>>>
>>
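
The relationship between the timeouts discussed in this thread can be
checked with a small script. The following is only a sketch: the helper
names are hypothetical, the configs are passed as plain dictionaries
rather than read from a real SparkConf, and the 120s fallback assumes the
usual default of spark.network.timeout. It resolves the effective poll
timeout (poll.ms if set, otherwise spark.network.timeout) and applies the
rule of thumb from the thread, flagging the case where the poll timeout is
less than request.timeout.ms.

```python
# Hypothetical helpers; only the property names come from the thread.

ASSUMED_NETWORK_TIMEOUT = "120s"  # assumption: default spark.network.timeout


def to_ms(value, default_unit="ms"):
    """Parse values such as "180s", "180000ms" or "180000" into milliseconds."""
    text = str(value).strip().lower()
    if text.endswith("ms"):
        return int(text[:-2])
    if text.endswith("s"):
        return int(text[:-1]) * 1000
    # No unit suffix: interpret the number in the given default unit.
    return int(text) * (1000 if default_unit == "s" else 1)


def effective_poll_ms(spark_conf):
    """poll.ms if set, otherwise fall back to spark.network.timeout."""
    poll = spark_conf.get("spark.streaming.kafka.consumer.poll.ms")
    if poll is not None:
        return to_ms(poll, default_unit="ms")
    network = spark_conf.get("spark.network.timeout", ASSUMED_NETWORK_TIMEOUT)
    return to_ms(network, default_unit="s")


def poll_shorter_than_request(spark_conf, kafka_params):
    """True when the poll timeout is less than request.timeout.ms."""
    request_ms = to_ms(kafka_params["request.timeout.ms"], default_unit="ms")
    return effective_poll_ms(spark_conf) < request_ms


if __name__ == "__main__":
    # Values reported in the thread: polling for 180000 ms, request timeout 180 s.
    spark_conf = {"spark.network.timeout": "180s"}
    kafka_params = {"request.timeout.ms": "180000"}
    print(effective_poll_ms(spark_conf))
    print(poll_shorter_than_request(spark_conf, kafka_params))
```

With the values reported in the thread both timeouts resolve to 180000 ms,
so the "less than" rule is not triggered by itself, which is consistent
with the original poster looking at other settings as well.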
