kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Srikanth <srikanth...@gmail.com>
Subject Re: Kafka Streams app error while rebalancing
Date Wed, 06 Dec 2017 11:18:19 GMT
Thanks Matthias. Please find my response inline.

On Wed, Dec 6, 2017 at 12:34 AM, Matthias J. Sax <matthias@confluent.io>
wrote:

> Hard to say.
>
> However, deleting state directories will not have any negative impact as
> you don't use stores. Thus, why do you not want to do this?
> <Sri> We are running this in a prod-like setup. I don't have SSH access to
> the box.

Its not ideal to request deleting these every now and then. It doesn't
> sound prod ready.
>


> Another workaround you can do, it to start four applications with 1
> thread each -- this would isolate the instances further and avoid the
> lock issue (you would need to run on different host, or on the same
> host, but with different state directory for the different instances.)
> <Sri> App has a large in-mem cache. I'd like more threads to use this
> cache to keep memory/cpu usage in balance.




> In general, I would recommend to upgrade you applications -- you don't
> need to upgrade the brokers for this. 1.0 versions has many additional
> bug fixes. This should resolve the issues you are facing.

<Sri> I haven't really tried bidirectional compatibility mode. Is there any
> caveat I need to be aware of for 1.0.x client and 0.10.2.1 brokers?

I probably should try this.



> Otherwise, it's hard to say without logs.
> <Sri> Unfortunately, that's all I had in the logs.  I know it isn't
> helpful. I've kept default logging to WARN except for few packages.

I assume any logging that indicates why app isn't making progress would be
> in WARN. I'll bump the level up for few days.



> Hope this helps.
>
>
> -Matthias
>
>
> On 12/5/17 9:35 AM, Srikanth wrote:
> > Hello,
> >
> > We noticed that a kafka streams app is stuck in rebalance state with
> below
> > error.
> > Two instance of the app were running fine until a rebalace was
> > triggered(possibly due to network issue).
> > Both app instance are running(no app restart)
> > App itself doesn't create/use state store. NUM_STREAM_THREADS_CONFIG=2
> >
> > I did see several tickets with similar errors that are marked as fixed.
> I'm
> > using version 0.10.2.1.
> >
> > 17/12/04 18:34:57 WARN StreamThread: Could not create task 0_2. Will
> retry.
> > org.apache.kafka.streams.errors.LockException: task [0_2] Failed to lock
> > the state directory for task 0_2
> >   at
> > org.apache.kafka.streams.processor.internals.
> ProcessorStateManager.<init>(ProcessorStateManager.java:100)
> >   at
> > org.apache.kafka.streams.processor.internals.AbstractTask.<init>(
> AbstractTask.java:73)
> >   at
> > org.apache.kafka.streams.processor.internals.
> StreamTask.<init>(StreamTask.java:108)
> >   at
> > org.apache.kafka.streams.processor.internals.
> StreamThread.createStreamTask(StreamThread.java:864)
> >   at
> > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.
> createTask(StreamThread.java:1237)
> >   at
> > org.apache.kafka.streams.processor.internals.StreamThread$
> AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
> >   at
> > org.apache.kafka.streams.processor.internals.
> StreamThread.addStreamTasks(StreamThread.java:967)
> >   at
> > org.apache.kafka.streams.processor.internals.StreamThread.access$600(
> StreamThread.java:69)
> >   at
> > org.apache.kafka.streams.processor.internals.StreamThread$1.
> onPartitionsAssigned(StreamThread.java:234)
> >   at
> > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.
> onJoinComplete(ConsumerCoordinator.java:259)
> >   at
> > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.
> joinGroupIfNeeded(AbstractCoordinator.java:352)
> >   at
> > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.
> ensureActiveGroup(AbstractCoordinator.java:303)
> >   at
> > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(
> ConsumerCoordinator.java:290)
> >   at
> > org.apache.kafka.clients.consumer.KafkaConsumer.
> pollOnce(KafkaConsumer.java:1029)
> >   at
> > org.apache.kafka.clients.consumer.KafkaConsumer.poll(
> KafkaConsumer.java:995)
> >   at
> > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(
> StreamThread.java:592)
> >   at
> > org.apache.kafka.streams.processor.internals.
> StreamThread.run(StreamThread.java:361)
> > 17/12/04 18:34:58 WARN StreamThread: Could not create task 0_8. Will
> retry.
> > org.apache.kafka.streams.errors.LockException: task [0_8] Failed to lock
> > the state directory for task 0_8
> >
> >
> > kafka-consumer-groups --new-consumer --bootstrap-server <...> --describe
> > --group GeoTest
> > Note: This will only show information about consumers that use the Java
> > consumer API (non-ZooKeeper-based consumers).
> >
> > Warning: Consumer group 'GeoTest' is rebalancing.
> >
> > I keep seeing the above lock exception continuously and app is not making
> > any progress. Any idea why it is stuck?
> > I read a few suggestions that required me to manually delete state
> > directory. I'd like to avoid that.
> >
> > Thanks,
> > Srikanth
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message