kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joel Koshy <jjkosh...@gmail.com>
Subject Re: Issue with topic deletion
Date Fri, 06 Feb 2015 02:58:23 GMT
Thanks will take a look..

On Wed, Feb 04, 2015 at 11:33:03PM -0800, Sumit Rangwala wrote:
> >
> > Any idea why you have session expirations? This is typically due to GC
> >> and/or flaky network. Regardless, we should be handling that scenario
> >> as well. However, your logs seem incomplete. Can you redo this and
> >> perhaps keep the set up running a little longer and send over those
> >> logs?
> >>
> >>
> > I am stress testing my application by doing a large number of read and
> > writes to kafka. My setup consist many docker instances (of brokers and
> > client) running (intentionally) on a single linux box. Since the machine is
> > overload, congested network and long GC are a possibility.
> >
> > I will redo the experiment and keep the kakfa brokers running. However, I
> > will move to 0.8.2 release since Jun asked me to try it for another issue
> > (topic creation). I hope that is fine.
> >
> >
> Joel, Harsha,
> 
> Here are the logs from a different run ( http://d.pr/f/1giCl/PA6GBhrU ).
> Everything is same except that I moved from 0.8.2-beta to 0.8.2.0 (for
> clients as well as brokers). My setup consist of 1 zookeeper, 3 kafka
> brokers, 5 my application nodes. I shutdown all my application nodes after
> the experiment but zookeeper and kafka brokers are still running in case we
> need more data. All the dump of log files on brokers were taken *after* I
> shut down all application nodes to make sure there were no consumer for any
> of the topics.
> 
> The current state of topics on kafka nodes where the topic with deletion
> issue is LAX1-GRIFFIN-r25-1423118905870
> 
> root@69a87c0e7c5b:/# date && $KAFKA_HOME/bin/kafka-topics.sh
> --zookeeper=$ZK_PORT_2181_TCP_ADDR:$ZK_PORT_2181_TCP_PORT --list
> 
> Thu Feb  5 07:15:28 UTC 2015
> LAX1-GRIFFIN-CONTROL
> LAX1-GRIFFIN-r1-1423119435923
> LAX1-GRIFFIN-r10-1423119436039
> LAX1-GRIFFIN-r11-1423119436675
> LAX1-GRIFFIN-r12-1423119436320
> LAX1-GRIFFIN-r13-1423119436162
> LAX1-GRIFFIN-r14-1423119436583
> LAX1-GRIFFIN-r15-1423119436446
> LAX1-GRIFFIN-r16-1423119437195
> LAX1-GRIFFIN-r17-1423119436330
> LAX1-GRIFFIN-r18-1423119436411
> LAX1-GRIFFIN-r19-1423119437150
> LAX1-GRIFFIN-r2-1423119435925
> LAX1-GRIFFIN-r20-1423119436750
> LAX1-GRIFFIN-r21-1423119437717
> LAX1-GRIFFIN-r22-1423119436820
> LAX1-GRIFFIN-r23-1423119436483
> LAX1-GRIFFIN-r24-1423119437210
> LAX1-GRIFFIN-r25-1423118905870 - marked for deletion
> LAX1-GRIFFIN-r25-1423119437277
> LAX1-GRIFFIN-r26-1423119438650
> LAX1-GRIFFIN-r27-1423119436855
> LAX1-GRIFFIN-r28-1423119436566
> LAX1-GRIFFIN-r29-1423119437955
> LAX1-GRIFFIN-r3-1423119435925
> LAX1-GRIFFIN-r30-1423119437949
> LAX1-GRIFFIN-r4-1423119435925
> LAX1-GRIFFIN-r5-1423119435923
> LAX1-GRIFFIN-r6-1423119436354
> LAX1-GRIFFIN-r7-1423119436095
> LAX1-GRIFFIN-r8-1423119436100
> LAX1-GRIFFIN-r9-1423119436377
> 
> Sumit
> 
> 
> 
> 
> >
> > Sumit
> >
> >
> >
> >> Thanks,
> >>
> >> Joel
> >>
> >> On Wed, Feb 04, 2015 at 01:00:46PM -0800, Sumit Rangwala wrote:
> >> > >
> >> > >
> >> > >> I have since stopped the container so I cannot say if
> >> > > LAX1-GRIFFIN-r45-1423000088317 was one of the topic in "marked for
> >> > > deletion" forever.  However, there were many topics (at least 10 of
> >> them)
> >> > > that were perennially in "marked for deletion" state.
> >> > >
> >> > >
> >> > I have the setup to recreate the issue in case the logs are not
> >> sufficient.
> >> >
> >> >
> >> > Sumit
> >> >
> >> >
> >> >
> >> > > Sumit
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >> -Harsha
> >> > >>
> >> > >> On Tue, Feb 3, 2015, at 09:19 PM, Harsha wrote:
> >> > >> > you are probably handling it but there is a case where you
call
> >> > >> > deleteTopic and kafka goes through delete topic process but
your
> >> > >> > consumer is running probably made a TopicMetadataRequest
for the
> >> same
> >> > >> > topic which can re-create the topic with the default
> >> num.partitions and
> >> > >> > replication.factor.  Did you try stopping the consumer first
and
> >> issue
> >> > >> > the topic delete.
> >> > >> > -Harsha
> >> > >> >
> >> > >> > On Tue, Feb 3, 2015, at 08:37 PM, Sumit Rangwala wrote:
> >> > >> > > On Tue, Feb 3, 2015 at 6:48 PM, Harsha <kafka@harsha.io>
wrote:
> >> > >> > >
> >> > >> > > > Sumit,
> >> > >> > > >        lets say you are deleting a older topic
"test1" do you
> >> have
> >> > >> any
> >> > >> > > >        consumers running simultaneously for the
topic "test1"
> >> while
> >> > >> > > >        deletion of topic going on.
> >> > >> > > >
> >> > >> > >
> >> > >> > > Yes it is the case. However, after a small period of
time (say
> >> few
> >> > >> > > minutes)
> >> > >> > > there won't be any consumer running for the deleted
topic.
> >> > >> > >
> >> > >> > >
> >> > >> > > Sumit
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > > -Harsha
> >> > >> > > >
> >> > >> > > > On Tue, Feb 3, 2015, at 06:17 PM, Joel Koshy wrote:
> >> > >> > > > > Thanks for the logs - will take a look tomorrow
unless
> >> someone
> >> > >> else
> >> > >> > > > > gets a chance to get to it today.
> >> > >> > > > >
> >> > >> > > > > Joel
> >> > >> > > > >
> >> > >> > > > > On Tue, Feb 03, 2015 at 04:11:57PM -0800,
Sumit Rangwala
> >> wrote:
> >> > >> > > > > > On Tue, Feb 3, 2015 at 3:37 PM, Joel
Koshy <
> >> jjkoshy.w@gmail.com
> >> > >> >
> >> > >> > > > wrote:
> >> > >> > > > > >
> >> > >> > > > > > > Hey Sumit,
> >> > >> > > > > > >
> >> > >> > > > > > > I thought you would be providing
the actual steps to
> >> > >> reproduce :)
> >> > >> > > > > > >
> >> > >> > > > > >
> >> > >> > > > > > I want to but some proprietary code prevents
me to do it.
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > > Nevertheless, can you get all the
relevant logs: state
> >> change
> >> > >> logs
> >> > >> > > > and
> >> > >> > > > > > > controller logs at the very least
and if possible server
> >> logs
> >> > >> and
> >> > >> > > > send
> >> > >> > > > > > > those over?
> >> > >> > > > > > >
> >> > >> > > > > >
> >> > >> > > > > > Here are all the logs you requested (there
are three
> >> brokers in
> >> > >> my
> >> > >> > > > setup
> >> > >> > > > > > k1, k2, k3): http://d.pr/f/1kprY/2quHBRRT
(Gmail has issue
> >> > >> with the
> >> > >> > > > file)
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > Sumit
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > >
> >> > >> > > > > > > Joel
> >> > >> > > > > > >
> >> > >> > > > > > > On Tue, Feb 03, 2015 at 03:27:43PM
-0800, Sumit Rangwala
> >> > >> wrote:
> >> > >> > > > > > > > In my setup kafka brokers are
set for auto topic
> >> creation.
> >> > >> In the
> >> > >> > > > > > > scenario
> >> > >> > > > > > > > below a node informs other
nodes (currently 5 in total)
> >> > >> about  a
> >> > >> > > > number
> >> > >> > > > > > > of
> >> > >> > > > > > > > new (non-existent) topics,
and  all the nodes almost
> >> > >> > > > simultaneously open
> >> > >> > > > > > > a
> >> > >> > > > > > > > consumer for each of those
topics. Sometime later
> >> another
> >> > >> node
> >> > >> > > > informs
> >> > >> > > > > > > all
> >> > >> > > > > > > > other nodes of a new list of
topics and each node, if
> >> they
> >> > >> find
> >> > >> > > > that an
> >> > >> > > > > > > > older topic exists in kafka,
goes ahead and deletes the
> >> > >> older
> >> > >> > > > topic.
> >> > >> > > > > > > What I
> >> > >> > > > > > > > have found is that many of
the topics stay in the
> >> "marked
> >> > >> for
> >> > >> > > > deletion"
> >> > >> > > > > > > > state forever.
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > I get the list of topics using
> >> > >> ZkUtils.getAllTopics(zkClient) and
> >> > >> > > > delete
> >> > >> > > > > > > > topics using AdminUtils.deleteTopic(zkClient,
topic).
> >> Since
> >> > >> many
> >> > >> > > > nodes
> >> > >> > > > > > > > might try to delete the same
topic at the same time I
> >> do
> >> > >> > > > > > > > see ZkNodeExistsException while
deleting the topic,
> >> which I
> >> > >> catch
> >> > >> > > > an
> >> > >> > > > > > > > ignore. (e.g.,
> >> > >> > > > org.apache.zookeeper.KeeperException$NodeExistsException:
> >> > >> > > > > > > > KeeperErrorCode = NodeExists
for
> >> > >> > > > > > > > /admin/delete_topics/LAX1-GRIFFIN-r13-1423001701601)
> >> > >> > > > > > > >
> >> > >> > > > > > > > # State of one deleted topic
on kafka brokers:
> >> > >> > > > > > > > Topic:LAX1-GRIFFIN-r13-1423001701601
PartitionCount:8
> >> > >> > > > ReplicationFactor:1
> >> > >> > > > > > > > Configs:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 0
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49558 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 1
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49554 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 2
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49557 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 3
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49558 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 4
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49554 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 5
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49557 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 6
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49558 Isr:
> >> > >> > > > > > > > Topic: LAX1-GRIFFIN-r13-1423001701601
Partition: 7
> >> Leader:
> >> > >> -1
> >> > >> > > > Replicas:
> >> > >> > > > > > > > 49554 Isr:
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > # Controller log says
> >> > >> > > > > > > >
> >> > >> > > > > > > > [2015-02-03 22:59:03,399] INFO
> >> [delete-topics-thread-49554],
> >> > >> > > > Deletion for
> >> > >> > > > > > > > replicas 49557,49554,49558
for partition
> >> > >> > > > > > > >
> >> > >> > > > > > >
> >> > >> > > >
> >> > >>
> >> [LAX1-GRIFFIN-r13-1423001701601,0],[LAX1-GRIFFIN-r13-1423001701601,6],[LAX1-GRIFFIN-r13-1423001701601,5],[LAX1-GRIFFIN-r13-1423001701601,3],[LAX1-GRIFFIN-r13-1423001701601,7],[LAX1-GRIFFIN-r13-1423001701601,1],[LAX1-GRIFFIN-r13-1423001701601,4],[LAX1-GRIFFIN-r13-1423001701601,2]
> >> > >> > > > > > > > of topic LAX1-GRIFFIN-r13-1423001701601
in progress
> >> > >> > > > > > > >
> >> (kafka.controller.TopicDeletionManager$DeleteTopicsThread)
> >> > >> > > > > > > >
> >> > >> > > > > > > > current time: Tue Feb  3 23:20:58
UTC 2015
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > Since I don't know the delete
topic algorithm, I am not
> >> > >> sure why
> >> > >> > > > sure
> >> > >> > > > > > > these
> >> > >> > > > > > > > topics are not garbage collected.
I do have the
> >> complete
> >> > >> setup
> >> > >> > > > running in
> >> > >> > > > > > > > docker right now on my local
box so please let me know
> >> if
> >> > >> any more
> >> > >> > > > info
> >> > >> > > > > > > is
> >> > >> > > > > > > > required to troubleshoot this
issue.
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > Furthermore, does ZkUtils.getAllTopics(zkClient)
return
> >> > >> "marked for
> >> > >> > > > > > > > deletion" topic as well. If
so, is there a easy way to
> >> get
> >> > >> a list
> >> > >> > > > of
> >> > >> > > > > > > active
> >> > >> > > > > > > > topics (other than looking
at all the topics in
> >> > >> > > > /admin/delete_topics/ and
> >> > >> > > > > > > > taking a set difference with
the topics returned
> >> > >> > > > > > > > by ZkUtils.getAllTopics(zkClient)
)
> >> > >> > > > > > > >
> >> > >> > > > > > > > Sumit
> >> > >> > > > > > > > (More setup info below)
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > Setup
> >> > >> > > > > > > > --------
> >> > >> > > > > > > > Zookeeper: 3.4.6
> >> > >> > > > > > > > Kafka broker: 0.8.2-beta
> >> > >> > > > > > > > Kafka clients: 0.8.2-beta
> >> > >> > > > > > > >
> >> > >> > > > > > > > # Kafka boker settings (all
other settings are default
> >> > >> 0.8.2-beta
> >> > >> > > > > > > settings)
> >> > >> > > > > > > > kafka.controlled.shutdown.enable:
'FALSE'
> >> > >> > > > > > > > kafka.auto.create.topics.enable:
'TRUE'
> >> > >> > > > > > > > kafka.num.partitions: 8
> >> > >> > > > > > > > kafka.default.replication.factor:
1
> >> > >> > > > > > > > kafka.rebalance.backoff.ms:
3000
> >> > >> > > > > > > > kafka.rebalance.max.retries:
10
> >> > >> > > > > > > > kafka.log.retention.minutes:
1200
> >> > >> > > > > > > > kafka.delete.topic.enable:
'TRUE'
> >> > >> > > > > > >
> >> > >> > > > > > >
> >> > >> > > > >
> >> > >> > > >
> >> > >>
> >> > >
> >> > >
> >>
> >>
> >


Mime
View raw message