kafka-users mailing list archives

From Sa Li <sal...@gmail.com>
Subject Re: NotLeaderForPartitionException while doing performance test
Date Thu, 08 Jan 2015 06:10:30 GMT
Yes, it is a weird hostname ;) but that is what our system guys named it. How
do I measure the number of connections open to 10.100.98.102?

Thanks

AL
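
One possible way to answer the question above, as a minimal sketch and not something proposed in the thread: count TCP sockets whose peer is the broker IP, grouped by state, from the output of `ss -tan` run on the producer host. The function names `count_connections` and `snapshot` are hypothetical, and the parser assumes the usual `ss` column layout (state, Recv-Q, Send-Q, local address, peer address) with IPv4 addresses.

```python
import subprocess
from collections import Counter


def count_connections(ss_output: str, peer_ip: str) -> Counter:
    """Count sockets to peer_ip per TCP state in `ss -tan` style text.

    Assumes columns: State Recv-Q Send-Q Local:Port Peer:Port.
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a socket row.
        if len(fields) < 5 or fields[0] == "State":
            continue
        # Peer column looks like "10.100.98.102:9092"; drop the port.
        peer_host = fields[4].rsplit(":", 1)[0]
        if peer_host == peer_ip:
            counts[fields[0]] += 1
    return counts


def snapshot(peer_ip: str) -> Counter:
    """Run `ss -tan` once on this host and count sockets to peer_ip."""
    result = subprocess.run(
        ["ss", "-tan"], capture_output=True, text=True, check=True
    )
    return count_connections(result.stdout, peer_ip)
```

Calling `snapshot("10.100.98.102")` in a loop while the performance test runs, and logging the result with a timestamp, would let the counts be lined up against the moments when "Connection refused" appears.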
On Jan 7, 2015 9:42 PM, "Jaikiran Pai" <jai.forums2013@gmail.com> wrote:

> On Thursday 08 January 2015 01:51 AM, Sa Li wrote:
>
>> I see this type of error again; it goes back to normal in a few seconds
>>
>> [2015-01-07 20:19:49,744] WARN Error in I/O with harmful-jar.master/
>> 10.100.98.102
>>
>
> That's a really weird hostname, the "harmful-jar.master". Is that really
> your hostname? You mention that this happens during performance testing.
> Have you taken note of how many connections are open to that 10.100.98.102
> IP when this "Connection refused" exception happens?
>
> -Jaikiran
>
>
>    (org.apache.kafka.common.network.Selector)
>> java.net.ConnectException: Connection refused
>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>          at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>          at org.apache.kafka.common.network.Selector.poll(
>> Selector.java:232)
>>          at
>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>          at java.lang.Thread.run(Thread.java:745)
>> [2015-01-07 20:19:49,754] WARN Error in I/O with harmful-jar.master/
>> 10.100.98.102 (org.apache.kafka.common.network.Selector)
>> java.net.ConnectException: Connection refused
>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>          at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>          at org.apache.kafka.common.network.Selector.poll(
>> Selector.java:232)
>>          at
>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>          at java.lang.Thread.run(Thread.java:745)
>> [2015-01-07 20:19:49,764] WARN Error in I/O with harmful-jar.master/
>> 10.100.98.102 (org.apache.kafka.common.network.Selector)
>> java.net.ConnectException: Connection refused
>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>          at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>          at org.apache.kafka.common.network.Selector.poll(
>> Selector.java:232)
>>          at
>> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:191)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:184)
>>          at
>> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
>>          at java.lang.Thread.run(Thread.java:745)
>> 160403 records sent, 32080.6 records/sec (91.78 MB/sec), 507.0 ms avg
>> latency, 2418.0 max latency.
>> 109882 records sent, 21976.4 records/sec (62.87 MB/sec), 672.7 ms avg
>> latency, 3529.0 max latency.
>> 100315 records sent, 19995.0 records/sec (57.21 MB/sec), 774.8 ms avg
>> latency, 3858.0 max latency.
>>
>> On Wed, Jan 7, 2015 at 12:07 PM, Sa Li <salicn@gmail.com> wrote:
>>
>>  Hi, All
>>>
>>> I am running a performance test with
>>>
>>> bin/kafka-run-class.sh org.apache.kafka.clients.
>>> tools.ProducerPerformance
>>> test-rep-three 500000000 100 -1 acks=1 bootstrap.servers=
>>> 10.100.98.100:9092,10.100.98.101:9092,10.100.98.102:9092
>>> buffer.memory=67108864 batch.size=8196
>>>
>>> where the topic test-rep-three is described as follows:
>>>
>>> bin/kafka-topics.sh --describe --zookeeper 10.100.98.101:2181 --topic
>>> test-rep-three
>>> Topic:test-rep-three    PartitionCount:8        ReplicationFactor:3
>>> Configs:
>>>          Topic: test-rep-three   Partition: 0    Leader: 100
>>>  Replicas:
>>> 100,102,101   Isr: 102,101,100
>>>          Topic: test-rep-three   Partition: 1    Leader: 101
>>>  Replicas:
>>> 101,100,102   Isr: 102,101,100
>>>          Topic: test-rep-three   Partition: 2    Leader: 102
>>>  Replicas:
>>> 102,101,100   Isr: 101,102,100
>>>          Topic: test-rep-three   Partition: 3    Leader: 100
>>>  Replicas:
>>> 100,101,102   Isr: 101,100,102
>>>          Topic: test-rep-three   Partition: 4    Leader: 101
>>>  Replicas:
>>> 101,102,100   Isr: 102,100,101
>>>          Topic: test-rep-three   Partition: 5    Leader: 102
>>>  Replicas:
>>> 102,100,101   Isr: 100,102,101
>>>          Topic: test-rep-three   Partition: 6    Leader: 102
>>>  Replicas:
>>> 100,102,101   Isr: 102,101,100
>>>          Topic: test-rep-three   Partition: 7    Leader: 101
>>>  Replicas:
>>> 101,100,102   Isr: 101,100,102
>>>
>>> It produces messages and runs for a while, but it periodically throws
>>> exceptions like these:
>>>
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>>> server
>>> is not the leader for that topic-partition.
>>> 141292 records sent, 28258.4 records/sec (80.85 MB/sec), 551.2 ms avg
>>> latency, 1494.0 max latency.
>>> 142526 records sent, 28505.2 records/sec (81.55 MB/sec), 580.8 ms avg
>>> latency, 1513.0 max latency.
>>> 146564 records sent, 29312.8 records/sec (83.86 MB/sec), 557.9 ms avg
>>> latency, 1431.0 max latency.
>>> 146755 records sent, 29351.0 records/sec (83.97 MB/sec), 556.7 ms avg
>>> latency, 1480.0 max latency.
>>> 147963 records sent, 29592.6 records/sec (84.67 MB/sec), 556.7 ms avg
>>> latency, 1546.0 max latency.
>>> 146931 records sent, 29386.2 records/sec (84.07 MB/sec), 550.9 ms avg
>>> latency, 1715.0 max latency.
>>> 146947 records sent, 29389.4 records/sec (84.08 MB/sec), 555.1 ms avg
>>> latency, 1750.0 max latency.
>>> 146422 records sent, 29284.4 records/sec (83.78 MB/sec), 557.9 ms avg
>>> latency, 1818.0 max latency.
>>> 147516 records sent, 29503.2 records/sec (84.41 MB/sec), 555.6 ms avg
>>> latency, 1806.0 max latency.
>>> 147877 records sent, 29575.4 records/sec (84.62 MB/sec), 552.1 ms avg
>>> latency, 1821.0 max latency.
>>> 147201 records sent, 29440.2 records/sec (84.23 MB/sec), 554.5 ms avg
>>> latency, 1826.0 max latency.
>>> 148317 records sent, 29663.4 records/sec (84.87 MB/sec), 558.1 ms avg
>>> latency, 1792.0 max latency.
>>> 147756 records sent, 29551.2 records/sec (84.55 MB/sec), 550.9 ms avg
>>> latency, 1806.0 max latency
>>>
>>> Then it goes back to normal. Is that because of a rebalance?
>>>
>>> thanks
>>>
>>>
>>>
>>> --
>>>
>>> Alec Li
>>>
>>>
>>
>>
>
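
A side observation on the quoted `--describe` output: partition 6 lists Leader: 102 while the first broker in its replica list is 100, whereas every other partition is led by its first replica. In Kafka the first replica in the list is the preferred leader, so this may suggest that leadership for that partition moved at some point; it does not by itself explain the exceptions. A minimal sketch for spotting such partitions follows. The function name `non_preferred_leaders` is hypothetical, and the input is assumed to be the describe output with the mail quote prefixes (`>>>`) stripped.

```python
import re

# Matches one partition row of `kafka-topics.sh --describe` output.
# \s also matches newlines, so rows wrapped across lines still match.
_ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)"
)


def non_preferred_leaders(describe_output: str):
    """Return (partition, leader, preferred) for each partition whose
    current leader is not the first broker in its replica list."""
    mismatches = []
    for match in _ROW.finditer(describe_output):
        partition = int(match.group(1))
        leader = int(match.group(2))
        # The first replica in the list is the preferred leader.
        preferred = int(match.group(3).split(",")[0])
        if leader != preferred:
            mismatches.append((partition, leader, preferred))
    return mismatches
```

Running this on the describe output before and after a test run would show whether leadership changed for any partition while the producer was sending.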
