kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Damian Guy <damian....@gmail.com>
Subject Re: Getting thread deadlock in streams 10.2.0 if a partition is re-assigned to a different thread in same instance
Date Thu, 02 Mar 2017 15:35:37 GMT
Hi Sachin,

This is because another thread hasn't released the lock. It is generally
caused because it is taking a long time to restore or process the incoming
records. It might be helpful to take a Thread dump so we can see what each
thread is doing.

Thanks,
Damian

On Thu, 2 Mar 2017 at 12:14 Sachin Mittal <sjmittal@gmail.com> wrote:

> Hi,
> We are getting some deadlock state after re-balance and what we found that
> it happens after re-balance the partition is assigned to a different thread
> in same instance.
>
> Looks like previous thread is not releasing the rocks db lock.
>
> Here are the complete logs spanned across multiple files.
>
>
> https://drive.google.com/open?id=0B-KsMy4P1bGxLTVVVWlZOGdQZTUzYm9aV3lfOFdubTJ2X2JR
>
> I am just posting the gist here as how we understood this issue.
>
>
> *===== TestKafkaAdvice.2017-03-01.89.log ========*
>
> 53352 INFO 2017-03-01 18:11:56,980 [StreamThread-3]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 96*
>
> 53353 INFO 2017-03-01 18:11:56,980 [StreamThread-4]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 96*
>
> 53354 INFO 2017-03-01 18:11:56,980 [StreamThread-1]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 96*
>
>
> 53361 INFO 2017-03-01 *18:11:57,084* [StreamThread-1]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-1]
> New partitions [[advice-stream-9, advice-strea**m-6]]* assigned at the end
> of consumer rebalance.
>
> 53362 INFO 2017-03-01 18:11:57,084 [StreamThread-3]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-3]
> New partitions [[advice-stream-7]]* assigned at the end of consumer
> rebalance.
>
> 53363 DEBUG 2017-03-01 18:11:57,084 [StreamThread-3]:
> org.apache.kafka.streams.processor.internals.StreamThread - stream-thread
> [StreamThread-3] closing suspended non-assigned task
>
> 53364 INFO 2017-03-01 18:11:57,084 [StreamThread-4]:
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Setting
> newly assigned partitions [advice-stream-8, advice-stream-2] fo      r
> group new-part-advice
>
> 53365 INFO 2017-03-01 18:11:57,084 [StreamThread-4]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> New partitions [[advice-stream-8, advice-strea**m-2]]* assigned at the end
> of consumer rebalance.
>
>
>
> 54253 INFO 2017-03-01 18:13:19,146 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> Creating active task 0_2 with assigned partiti**ons [[advice-stream-2]]*
>
>
> 56789 INFO 2017-03-01 18:17:42,311 [StreamThread-1]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 97*
>
> 56790 INFO 2017-03-01 18:17:42,311 [StreamThread-3]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 97*
>
>
> 56796 INFO 2017-03-01 *18:17:42,312* [StreamThread-3]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-3]
> New partitions [[advice-stream-9, advice-strea**m-6]]* assigned at the end
> of consumer rebalance.
>
> 56805 INFO 2017-03-01 18:17:42,312 [StreamThread-1]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-1]
> New partitions [[advice-stream-8, advice-strea**m-4]]* assigned at the end
> of consumer rebalance.
>
>
>
> 56823 INFO 2017-03-01 18:17:42,465 *[StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-1]
> Creating active task 0_4 with assigned partiti**ons [[advice-stream-4]]*
>
> 56831 DEBUG 2017-03-01 18:17:42,531 *[StreamThread-1]*:
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Group
> new-part-advice *fetching committed offsets for partitions:
> [advi**ce-stream-8,
> advice-stream-4]*
>
>
>
> 57769 INFO 2017-03-01 *18:20:16,291* [StreamThread-4]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> Creating active task 0_8 with assigned partiti**ons [[advice-stream-8]]*
>
>
>
>
> 57935 INFO 2017-03-01 18:20:19,181 [StreamThread-4]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> Removing all active tasks [[0_2, 0_8]]*
>
>
>
> *======== TestKafkaAdvice.2017-03-01.90.log  ============*
>
>
>  742 INFO 2017-03-01 *18:22:42,697* [StreamThread-3]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 98*
>
>  743 INFO 2017-03-01 18:22:42,697 [StreamThread-4]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with *generation 98*
>
>
> 746 INFO 2017-03-01 18:22:42,697 [StreamThread-3]:
> org.apache.kafka.streams.
> processor.internals.StreamThread - stream-thread *[StreamThread-3] New
> partitions [[advice-stream-9, advice-stream**-4]]* assigned at the end of
> consumer rebalance.
>
> 748 INFO 2017-03-01 18:22:42,712 [StreamThread-4]:
> org.apache.kafka.streams.
> processor.internals.StreamThread - stream-thread *[StreamThread-4] New
> partitions [[advice-stream-8, advice-stream**-2]]* assigned at the end of
> consumer rebalance.
>
> ...............
>
> 4502 WARN 2017-03-01 *18:26:23,323* *[StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_4*. Will retry.
>
>
> ***************************************  Some time later.......*
>
>
> 4444 INFO 2017-03-01 18:25:44,071 [StreamThread-4]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> Removing all active tasks [[0_2, 0_8]]*
>
>
> ************************************
> *Task 0_2 (from line # 54253) moved to a different instance's thread-3 and
> Task 0_8 got re-assigned to Thread-4 (line # 4584 below)*
>
>
> 4474 INFO 2017-03-01 18:26:13,443 [StreamThread-1]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread [*StreamThread-1]
> Creating active task 0_8 with assigned partitio**ns [[advice-stream-8]]*
>
> 4475 WARN 2017-03-01 18:26:14,445 *[StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_8*. Will retry.
>
>
>
>
> 4581 INFO 2017-03-01 *18:27:43,048* [StreamThread-4]:
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> Successfully joined group new-part-advice with generation 99
>
>
> 4584 INFO 2017-03-01 18:27:43,048 [StreamThread-4]:
> org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread *[StreamThread-4]
> New partitions [[advice-stream-8, advice-stream**-4]]* assigned at the end
> of consumer rebalance.
>
>
> *************************************  Task Task 0_4 lock's as per line
> #56823 above still seems to be with Thread-1 !!!*
>
>
> 4625 WARN 2017-03-01 *18:27:44,092* *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_4.* Will retry.
>
> 4646 WARN 2017-03-01 18:27:45,145 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 4955 WARN 2017-03-01 18:29:37,508 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 4982 WARN 2017-03-01 18:29:49,125 *[StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5073 WARN 2017-03-01 18:31:20,911 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5173 WARN 2017-03-01 18:34:46,714 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5199 WARN 2017-03-01 18:36:39,729 *[StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5248 WARN 2017-03-01 18:41:37,329 *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5279 WARN 2017-03-01 18:50:19,977 *[StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
>
> 5397 WARN 2017-03-01 *19:19:13,900* *[StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_8*. Will retry.
>
> 5721 WARN 2017-03-01 *22:03:07,629 [StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_8*. Will retry.
>
>
> 5698 WARN 2017-03-01 *22:01:32,017* *[StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 5746 WARN 2017-03-01 *22:06:29,340* *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_4.* Will retry.
>
>
> *======== TestKafkaAdvice.log  ============*
>
>
> 148 WARN 2017-03-02 *01:41:35,831* *[StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_8.* Will retry.
>
> 735 WARN 2017-03-02 *08:58:31,234* *[StreamThread-1]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_8*. Will retry.
>
>
> 712 WARN 2017-03-02 *08:56:55,810 [StreamThread-3]*:
> org.apache.kafka.streams.processor.internals.StreamThread - Could not
> create task 0_4. Will retry.
>
> 760 WARN 2017-03-02 *09:01:52,945* *[StreamThread-4]*:
> org.apache.kafka.streams.processor.internals.StreamThread - *Could not
> create task 0_4*. Will retry.
>
>
> ***********************************    Finally all threads reach the
> deadlock state where they cannot create task*
>
>
> Please let us know, why we get such a deadlock state.
>
>
> Thanks
>
> Sachin
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message