Hi,

 

I want to report an issue where adding a server at runtime to my Streams compute cluster caused errors and then halted the cluster completely. I am not sure whether this is the actual cause, but it is the one thing I did differently after an 18-hour smooth run of the Streams app.

 

Initially, I had one machine working on my Kafka topic, which contains impressions and clicks. The job ran overnight. In the morning I added another machine to the cluster, and since then the application gets stuck every time after working fine for a while.

 

Please find the kafka_log_snippet and poc_log_snippet attached.

 

After these nodes failed, I tried restarting just one machine in my compute cluster to see whether it could initialize itself.

Please find the logs for that attached as well. The following are the log lines I saw most often:

 

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-38 at offset 556717 since the current position is 557065

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-38] to broker 172.29.65.190:9092 (id: 0 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-48 at offset 607657 since the current position is 607880

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-48] to broker 172.29.65.192:9092 (id: 2 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-31 at offset 282265 since the current position is 282327

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-31] to broker 172.29.65.191:9092 (id: 1 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-3 at offset 499952 since the current position is 500324

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-3] to broker 172.29.65.192:9092 (id: 2 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-21 at offset 587018 since the current position is 587227

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-21] to broker 172.29.65.192:9092 (id: 2 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-49 at offset 276209 since the current position is 276271

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-49] to broker 172.29.65.191:9092 (id: 1 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-16 at offset 592727 since the current position is 592896

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-16] to broker 172.29.65.191:9092 (id: 1 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-37 at offset 458224 since the current position is 458343

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-37] to broker 172.29.65.191:9092 (id: 1 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-59 at offset 495722 since the current position is 496113

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-59] to broker 172.29.65.190:9092 (id: 0 rack: null)

2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for LIC2-4-licountci-4-changelog-35 at offset 230310 since the current position is 231236

2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions [LIC2-4-licountci-4-changelog-35] to broker 172.29.65.190:9092 (id: 0 rack: null)
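
To gauge how far behind the ignored fetches are, the lines above can be tallied per changelog partition. The following is only a minimal Python sketch, assuming the DEBUG message format is exactly as in the snippet above. The function name summarize_ignored_fetches and the sample list are illustrative and are not part of Kafka.

```python
import re
from collections import defaultdict

# Matches the "Ignoring fetched records" DEBUG lines in the format shown above.
IGNORED = re.compile(
    r"Ignoring fetched records for (?P<tp>\S+) at offset (?P<fetched>\d+) "
    r"since the current position is (?P<position>\d+)"
)


def summarize_ignored_fetches(lines):
    """Return {topic-partition: (count, max_gap)} for ignored-fetch lines.

    gap = current position - fetched offset, i.e. how stale the fetch was.
    """
    summary = defaultdict(lambda: [0, 0])
    for line in lines:
        match = IGNORED.search(line)
        if match is None:
            continue  # e.g. "Sending fetch for partitions" lines
        gap = int(match.group("position")) - int(match.group("fetched"))
        entry = summary[match.group("tp")]
        entry[0] += 1
        entry[1] = max(entry[1], gap)
    return {tp: (count, gap) for tp, (count, gap) in summary.items()}


# Sample input copied from the log excerpt above.
sample = [
    "2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for "
    "LIC2-4-licountci-4-changelog-38 at offset 556717 since the current position is 557065",
    "2017-05-03 14:15:53 DEBUG Fetcher:180 - Sending fetch for partitions "
    "[LIC2-4-licountci-4-changelog-38] to broker 172.29.65.190:9092 (id: 0 rack: null)",
    "2017-05-03 14:15:53 DEBUG Fetcher:526 - Ignoring fetched records for "
    "LIC2-4-licountci-4-changelog-35 at offset 230310 since the current position is 231236",
]
result = summarize_ignored_fetches(sample)
for tp, (count, gap) in sorted(result.items()):
    print(tp, count, gap)
```

For the two sample "Ignoring" lines, the gaps come out to 348 offsets (changelog-38) and 926 offsets (changelog-35).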

 

Regards,

-Sameer.