kafka-users mailing list archives

From Avi Flax <avi.f...@parkassist.com>
Subject [Streams] App Instance with lots of concurrency not consuming from 2 specific topics out of 11 total
Date Thu, 15 Dec 2016 21:39:25 GMT

Hi all,

I apologize for flooding the list with questions lately. I guess I’m having a rough week.

I thought my app was finally running fine after Damian’s help on Monday, but it turns out
that it hasn’t been (successfully) consuming 2 of the topics it should be (out of 11 total).

I’ve been trying to debug this all day but I’m out of ideas, hence this message.

Some background on my streams app:

* Kafka Streams 0.10.0.1, Java 8, JRuby 9.1.5.0
* currently running a single instance
* consuming from 11 topics with a total of 130 partitions
* num.stream.threads is 130
* constructing a single KafkaStreams and KStreamBuilder
* calling KStreamBuilder.stream once for each topic

That last point follows this message from Damian on Monday:

https://lists.apache.org/thread.html/727ed4e6fba9bf350e500e0d3d1087f868337d7abccad1a38d06500f@%3Cusers.kafka.apache.org%3E

The app is successfully consuming from 9 of the topics, but for some mysterious reason it
is _not_ consuming from 2 of the topics. Two specific topics, consistently.

I first noticed the problem when looking at my lag graph in Datadog, which is broken out by
topic — I noticed that certain topics seemed missing.

So I ran kafka-consumer-groups to get a closer look, and both topics show up in the list,
with all their partitions, but their CURRENT-OFFSET value is “unknown” for every thread/consumer.

I’ve been trying to figure out what’s different about these 2 topics, but so far I’ve
had no luck; I just can’t find any differences.

What I’ve tried so far:

* consuming from the topics with kafkacat → looks good

* scrutinizing my app’s logs for errors or warnings related to these topics → see nothing

* stopping the app, changing its config to consume from only these 2 topics → nothing, same
result

* running a different instance of the app with different IDs, consuming from 1 of the problematic
topics only → nothing, it just sits there

In that last case, I took a look at the threads with jconsole. All the StreamThreads are just
sitting there with this status:

State: RUNNABLE
Total blocked: 23  Total waited: 0

Stack trace: 
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
   - locked sun.nio.ch.Util$3@17e90d96
   - locked java.util.Collections$UnmodifiableSet@2a14b4cc
   - locked sun.nio.ch.EPollSelectorImpl@5ffc44a4
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
org.apache.kafka.common.network.Selector.select(Selector.java:454)
org.apache.kafka.common.network.Selector.poll(Selector.java:277)
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
...

At this point I’m at the end of my rope — I’m out of ideas. I would very much appreciate
any suggestions for how to proceed. I’d be happy to supply log files, etc.

Thank you!
Avi


————
Software Architect @ Park Assist
We’re hiring! http://tech.parkassist.com/jobs/