kafka-users mailing list archives

From Sa Li <sal...@gmail.com>
Subject Re: NotLeaderForPartitionException while doing performance test
Date Thu, 08 Jan 2015 19:13:52 GMT
Thanks, Jaikiran

I tried to reproduce the issue by running the same performance test on the
master node of the cluster, exemplary-birds.master, and I saw the same error
again:
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

At the same time, I ran "lsof -i"; here is the output:

java       6991   root   28u  IPv6 4334833      0t0  TCP *:50320 (LISTEN)
java       6991   root   29u  IPv6 4351835      0t0  TCP exemplary-birds.master:9092->exemplary-birds.master:50472 (ESTABLISHED)
java       6991   root   38u  IPv6 4366588      0t0  TCP exemplary-birds.master:59536->complicated-laugh.master:2181 (ESTABLISHED)
java       6991   root  150u  IPv6 4361502      0t0  TCP *:9092 (LISTEN)
java       6991   root  151u  IPv6 4368439      0t0  TCP exemplary-birds.master:9092->harmful-jar.master:51131 (ESTABLISHED)
java       6991   root  154u  IPv6 4365924      0t0  TCP exemplary-birds.master:55248->voluminous-mass.master:9092 (ESTABLISHED)
java       6991   root  155u  IPv6 4366591      0t0  TCP exemplary-birds.master:55245->voluminous-mass.master:9092 (ESTABLISHED)
java       6991   root  157u  IPv6 4365923      0t0  TCP exemplary-birds.master:41946->harmful-jar.master:9092 (ESTABLISHED)
java       6991   root  176u  IPv6 4358833      0t0  TCP exemplary-birds.master:55251->voluminous-mass.master:9092 (ESTABLISHED)
java       6991   root  179u  IPv6 4292501      0t0  TCP exemplary-birds.master:41951->harmful-jar.master:9092 (ESTABLISHED)
java       6991   root  180u  IPv6 4338331      0t0  TCP exemplary-birds.master:9092->harmful-jar.master:51133 (ESTABLISHED)
java       6991   root  181u  IPv6 4364530      0t0  TCP exemplary-birds.master:9092->voluminous-mass.master:42897 (ESTABLISHED)
java       6991   root  182u  IPv6 4358834      0t0  TCP exemplary-birds.master:9092->harmful-jar.master:51134 (ESTABLISHED)
java       6991   root  183u  IPv6 4354353      0t0  TCP exemplary-birds.master:9092->voluminous-mass.master:42898 (ESTABLISHED)
java       6991   root  190u  IPv6 4351836      0t0  TCP exemplary-birds.master:9092->localhost:40786 (ESTABLISHED)
java       6991   root  201u  IPv6 4364543      0t0  TCP exemplary-birds.master:9092->harmful-jar.master:51135 (ESTABLISHED)
java       6991   root  202u  IPv6 4364544      0t0  TCP exemplary-birds.master:9092->voluminous-mass.master:42899 (ESTABLISHED)
java       7218   root   44u  IPv6 4366240      0t0  TCP *:46256 (LISTEN)
java       7218   root   48u  IPv6 4366602      0t0  TCP exemplary-birds.master:50472->exemplary-birds.master:9092 (ESTABLISHED)
java       7218   root   50u  IPv6 4350446      0t0  TCP exemplary-birds.master:41960->harmful-jar.master:9092 (ESTABLISHED)
java       7218   root   51u  IPv6 4350447      0t0  TCP localhost:40786->exemplary-birds.master:9092 (ESTABLISHED)
java       7218   root   52u  IPv6 4350448      0t0  TCP exemplary-birds.master:55263->voluminous-mass.master:9092 (ESTABLISHED)
java      17582   root   44u  IPv6 4326187      0t0  TCP *:46316 (LISTEN)
ntpd      18649    ntp   16u  IPv4  656334      0t0  UDP *:ntp
ntpd      18649    ntp   17u  IPv6  656335      0t0  UDP *:ntp
ntpd      18649    ntp   18u  IPv4  656341      0t0  UDP localhost:ntp
ntpd      18649    ntp   19u  IPv4  656342      0t0  UDP exemplary-birds.master:ntp
ntpd      18649    ntp   20u  IPv6  656343      0t0  UDP localhost:ntp
ntpd      18649    ntp   21u  IPv6  656344      0t0  UDP [fe80::7a2b:cbff:fe1f:2e77]:ntp
sshd      21995   root    3u  IPv4 4277546      0t0  TCP exemplary-birds.master:ssh->10.100.68.15:60642 (ESTABLISHED)
sshd      22091 fitsum    3u  IPv4 4277546      0t0  TCP exemplary-birds.master:ssh->10.100.68.15:60642 (ESTABLISHED)
java      22152   root   21u  IPv6  213140      0t0  TCP *:52411 (LISTEN)
java      22152   root   26u  IPv6  213145      0t0  TCP *:2181 (LISTEN)
java      22152   root   27u  IPv6  211541      0t0  TCP exemplary-birds.master:3888 (LISTEN)
java      22152   root   28u  IPv6  443527      0t0  TCP exemplary-birds.master:3888->complicated-laugh.master:43940 (ESTABLISHED)
java      22152   root   29u  IPv6   23347      0t0  TCP exemplary-birds.master:43797->harmful-jar.master:2888 (ESTABLISHED)
java      22152   root   30u  IPv6  204517      0t0  TCP exemplary-birds.master:3888->harmful-jar.master:50791 (ESTABLISHED)
java      22152   root   31u  IPv6 4278513      0t0  TCP exemplary-birds.master:3888->voluminous-mass.master:50452 (ESTABLISHED)
java      22152   root   32u  IPv6 4345845      0t0  TCP exemplary-birds.master:2181->harmful-jar.master:45048 (ESTABLISHED)
java      22152   root   33u  IPv6  443552      0t0  TCP exemplary-birds.master:3888->beloved-judge.master:56370 (ESTABLISHED)
java      22152   root   35u  IPv6 4364514      0t0  TCP exemplary-birds.master:2181->voluminous-mass.master:60600 (ESTABLISHED)
ssh       24632     sa    3u  IPv4 4289852      0t0  TCP exemplary-birds.master:60510->harmful-jar.master:ssh (ESTABLISHED)
ssh       24645     sa    3u  IPv4 4289867      0t0  TCP exemplary-birds.master:33295->voluminous-mass.master:ssh (ESTABLISHED)
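
To put numbers on the listing above, here is a minimal Python sketch that
tallies ESTABLISHED connections per remote host for a given local port. The
function name count_established, the regular expression, and the three-entry
sample are illustrative assumptions and not part of any Kafka or lsof tooling;
the sample entries are copied from the output above.

```python
import re
from collections import Counter

# Matches the "local:port->remote:port (ESTABLISHED)" part of an lsof -i entry.
CONN_RE = re.compile(r"(\S+):(\w+)->(\S+):(\w+) \(ESTABLISHED\)")


def count_established(lsof_text, local_port="9092"):
    """Count ESTABLISHED connections on local_port, grouped by remote host."""
    # Output pasted into mail is often wrapped after "TCP"; collapsing all
    # whitespace lets wrapped and unwrapped entries match the same pattern.
    flat = " ".join(lsof_text.split())
    counts = Counter()
    for _local_host, lport, remote_host, _rport in CONN_RE.findall(flat):
        if lport == local_port:
            counts[remote_host] += 1
    return counts


# Three entries taken from the listing in this message.
sample = """\
java 6991 root 151u IPv6 4368439 0t0 TCP
exemplary-birds.master:9092->harmful-jar.master:51131 (ESTABLISHED)
java 6991 root 154u IPv6 4365924 0t0 TCP
exemplary-birds.master:55248->voluminous-mass.master:9092 (ESTABLISHED)
java 6991 root 180u IPv6 4338331 0t0 TCP
exemplary-birds.master:9092->harmful-jar.master:51133 (ESTABLISHED)
"""

if __name__ == "__main__":
    for host, n in count_established(sample).most_common():
        print(host, n)
```

Feeding it the full "lsof -i" text captured while the errors are occurring,
and again during a quiet period, would show whether the connection count to
the broker port actually changes.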

I didn't see anything wrong with it, but it seems the connection was
temporarily closed. Has anyone had a similar issue?
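
One thing that may be relevant: in the topic description quoted below,
partition 6 has leader 102 while its replica list starts with 100. Kafka
treats the first replica in the list as the preferred leader, so leadership
for that partition has moved at some point. Here is a minimal Python sketch
for spotting such partitions; the function name non_preferred_leaders and the
parsing regex are my own assumptions, and the sample lines are taken from the
describe output in this thread.

```python
import re

# Pulls partition id, leader id, and replica list from --describe output.
LINE_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\d+)\s+Replicas:\s*([\d,]+)"
)


def non_preferred_leaders(describe_text):
    """Return (partition, leader, preferred) where leader != first replica."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(describe_text.split())
    result = []
    for part, leader, replicas in LINE_RE.findall(flat):
        preferred = replicas.split(",")[0]  # first replica = preferred leader
        if leader != preferred:
            result.append((int(part), int(leader), int(preferred)))
    return result


# Partitions 5 to 7 from the describe output quoted in this thread.
sample = """\
Topic: test-rep-three Partition: 5 Leader: 102 Replicas: 102,100,101 Isr: 100,102,101
Topic: test-rep-three Partition: 6 Leader: 102 Replicas: 100,102,101 Isr: 102,101,100
Topic: test-rep-three Partition: 7 Leader: 101 Replicas: 101,100,102 Isr: 101,100,102
"""

if __name__ == "__main__":
    for part, leader, preferred in non_preferred_leaders(sample):
        print(f"partition {part}: leader {leader}, preferred {preferred}")
```

Running the describe command before and after a burst of errors and comparing
the results would show whether leaders are actually moving during the test.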

Thanks

On Wed, Jan 7, 2015 at 10:32 PM, Jaikiran Pai <jai.forums2013@gmail.com>
wrote:

> There are different ways to find the connection count and each one depends
> on the operating system that's being used. "lsof -i" is one option, for
> example, on *nix systems.
>
> -Jaikiran
>
> On Thursday 08 January 2015 11:40 AM, Sa Li wrote:
>
>> Yes, it is a weird hostname ;), that is what our system guys named it. How
>> do I measure the connections open to 10.100.98.102?
>>
>> Thanks
>>
>> AL
>> On Jan 7, 2015 9:42 PM, "Jaikiran Pai" <jai.forums2013@gmail.com> wrote:
>>
>>  On Thursday 08 January 2015 01:51 AM, Sa Li wrote:
>>>
>>>  see this type of error again, back to normal in few secs
>>>>
>>>> [2015-01-07 20:19:49,744] WARN Error in I/O with harmful-jar.master/
>>>> 10.100.98.102
>>>>
>>>>  That's a really weird hostname, the "harmful-jar.master". Is that
>>> really
>>> your hostname? You mention that this happens during performance testing.
>>> Have you taken a note of how many connections are open to that
>>> 10.100.98.102
>>> IP when this "Connection refused" exception happens?
>>>
>>> -Jaikiran
>>>
>>>
>>>     (org.apache.kafka.common.network.Selector)
>>>
>>>> java.net.ConnectException: Connection refused
>>>>           at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>           at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>           at org.apache.kafka.common.network.Selector.poll(
>>>> Selector.java:232)
>>>>           at
>>>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>>>           at java.lang.Thread.run(Thread.java:745)
>>>> [2015-01-07 20:19:49,754] WARN Error in I/O with harmful-jar.master/
>>>> 10.100.98.102 (org.apache.kafka.common.network.Selector)
>>>> java.net.ConnectException: Connection refused
>>>>           at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>           at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>           at org.apache.kafka.common.network.Selector.poll(
>>>> Selector.java:232)
>>>>           at
>>>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>>>           at java.lang.Thread.run(Thread.java:745)
>>>> [2015-01-07 20:19:49,764] WARN Error in I/O with harmful-jar.master/
>>>> 10.100.98.102 (org.apache.kafka.common.network.Selector)
>>>> java.net.ConnectException: Connection refused
>>>>           at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>           at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>           at org.apache.kafka.common.network.Selector.poll(
>>>> Selector.java:232)
>>>>           at
>>>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>>>           at
>>>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>>>           at java.lang.Thread.run(Thread.java:745)
>>>> 160403 records sent, 32080.6 records/sec (91.78 MB/sec), 507.0 ms avg latency, 2418.0 max latency.
>>>> 109882 records sent, 21976.4 records/sec (62.87 MB/sec), 672.7 ms avg latency, 3529.0 max latency.
>>>> 100315 records sent, 19995.0 records/sec (57.21 MB/sec), 774.8 ms avg latency, 3858.0 max latency.
>>>>
>>>> On Wed, Jan 7, 2015 at 12:07 PM, Sa Li <salicn@gmail.com> wrote:
>>>>
>>>>   Hi, All
>>>>
>>>>> I am running a performance test with:
>>>>>
>>>>> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
>>>>>   test-rep-three 500000000 100 -1 acks=1 \
>>>>>   bootstrap.servers=10.100.98.100:9092,10.100.98.101:9092,10.100.98.102:9092 \
>>>>>   buffer.memory=67108864 batch.size=8196
>>>>>
>>>>> where the topic test-rep-three is described as follows:
>>>>>
>>>>> bin/kafka-topics.sh --describe --zookeeper 10.100.98.101:2181 --topic test-rep-three
>>>>> Topic:test-rep-three    PartitionCount:8        ReplicationFactor:3     Configs:
>>>>>   Topic: test-rep-three   Partition: 0    Leader: 100   Replicas: 100,102,101   Isr: 102,101,100
>>>>>   Topic: test-rep-three   Partition: 1    Leader: 101   Replicas: 101,100,102   Isr: 102,101,100
>>>>>   Topic: test-rep-three   Partition: 2    Leader: 102   Replicas: 102,101,100   Isr: 101,102,100
>>>>>   Topic: test-rep-three   Partition: 3    Leader: 100   Replicas: 100,101,102   Isr: 101,100,102
>>>>>   Topic: test-rep-three   Partition: 4    Leader: 101   Replicas: 101,102,100   Isr: 102,100,101
>>>>>   Topic: test-rep-three   Partition: 5    Leader: 102   Replicas: 102,100,101   Isr: 100,102,101
>>>>>   Topic: test-rep-three   Partition: 6    Leader: 102   Replicas: 100,102,101   Isr: 102,101,100
>>>>>   Topic: test-rep-three   Partition: 7    Leader: 101   Replicas: 101,100,102   Isr: 101,100,102
>>>>>
>>>>> It produces messages and runs for a while, but it periodically throws
>>>>> exceptions like these:
>>>>>
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
>>>>> 141292 records sent, 28258.4 records/sec (80.85 MB/sec), 551.2 ms avg latency, 1494.0 max latency.
>>>>> 142526 records sent, 28505.2 records/sec (81.55 MB/sec), 580.8 ms avg latency, 1513.0 max latency.
>>>>> 146564 records sent, 29312.8 records/sec (83.86 MB/sec), 557.9 ms avg latency, 1431.0 max latency.
>>>>> 146755 records sent, 29351.0 records/sec (83.97 MB/sec), 556.7 ms avg latency, 1480.0 max latency.
>>>>> 147963 records sent, 29592.6 records/sec (84.67 MB/sec), 556.7 ms avg latency, 1546.0 max latency.
>>>>> 146931 records sent, 29386.2 records/sec (84.07 MB/sec), 550.9 ms avg latency, 1715.0 max latency.
>>>>> 146947 records sent, 29389.4 records/sec (84.08 MB/sec), 555.1 ms avg latency, 1750.0 max latency.
>>>>> 146422 records sent, 29284.4 records/sec (83.78 MB/sec), 557.9 ms avg latency, 1818.0 max latency.
>>>>> 147516 records sent, 29503.2 records/sec (84.41 MB/sec), 555.6 ms avg latency, 1806.0 max latency.
>>>>> 147877 records sent, 29575.4 records/sec (84.62 MB/sec), 552.1 ms avg latency, 1821.0 max latency.
>>>>> 147201 records sent, 29440.2 records/sec (84.23 MB/sec), 554.5 ms avg latency, 1826.0 max latency.
>>>>> 148317 records sent, 29663.4 records/sec (84.87 MB/sec), 558.1 ms avg latency, 1792.0 max latency.
>>>>> 147756 records sent, 29551.2 records/sec (84.55 MB/sec), 550.9 ms avg latency, 1806.0 max latency
>>>>>
>>>>> and then it goes back to a normal state. Is that because of a rebalance?
>>>>>
>>>>> thanks
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Alec Li
>>>>>
>>>>>
>>>>>
>>>>
>


-- 

Alec Li
