kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: Strange partitioning behavior with 0.8.1.1
Date Wed, 11 Jun 2014 21:25:15 GMT
In the console producer you can specify producer properties on the command
line, such as metadata-expiry-ms.

You can type just ./kafka-console-producer.sh and it will show you all the
configs that you can specify.

Guozhang


On Wed, Jun 11, 2014 at 10:56 AM, Prakash Gowri Shankor <
prakash.shankor@gmail.com> wrote:

> Guozhang,
>
> I set this in my producer.properties
>
> topic.metadata.refresh.interval.ms=1000
>
> Then I start the console producer as
>
> ./kafka-console-producer.sh --broker-list localhost:9092 --topic test2
>
> I still don't see data being written to different partitions after every 1
> second.
>
> I wonder if the producer is picking up the properties file - I don't see it
> being passed explicitly in the script to the kafka.producer.ConsoleProducer
> class.
>
> -Prakash
>
>
> On Tue, Jun 10, 2014 at 11:04 AM, Guozhang Wang <wangguoz@gmail.com>
> wrote:
>
> > Yes, reducing the refresh interval to 100ms will cause it to try to select
> > another partition every 100ms, not necessarily a different partition
> > though, since it just takes the next random int % num.partitions.
> >
> > Setting the key can also resolve this issue, as long as the key values are
> > evenly distributed, since the partition selected is effectively based on
> > key values.
> >
> > Guozhang
> >
> >
> > On Tue, Jun 10, 2014 at 9:54 AM, Prakash Gowri Shankor <
> > prakash.shankor@gmail.com> wrote:
> >
> > > Can you please tell me how to set this property ?
> > > topic.metadata.refresh.interval.ms
> > > Is a value of 100 low enough to solve this issue ?
> > > I'm guessing I can set it to 100 and restart the command line producer
> > > and the partitioning should work ? Please confirm.
> > >
> > > Thanks
> > >
> > >
> > > On Mon, Jun 9, 2014 at 5:09 PM, Prakash Gowri Shankor <
> > > prakash.shankor@gmail.com> wrote:
> > >
> > > > Thank you Guozhang.
> > > > I've specified how I set and use the property in my previous mail. Can
> > > > you tell me if that is fine ?
> > > > I also noticed that the kafka-console-producer.sh takes a custom
> > > > property (key-value) on the command line. Would it help to set this
> > > > property directly on the command line of the producer script ?
> > > >
> > > >
> > > > > On Mon, Jun 9, 2014 at 5:06 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > >
> > > > >> In the new producer we are changing the default behavior back to pure
> > > > >> random partitioning and letting users customize their own partitioning
> > > > >> schemes if they want. For now, reducing topic.metadata.refresh.interval.ms
> > > > >> should help because the stickiness only persists until a metadata
> > > > >> refresh.
> > > >>
> > > >> Guozhang
> > > >>
> > > >>
> > > >> On Mon, Jun 9, 2014 at 4:54 PM, Prakash Gowri Shankor <
> > > >> prakash.shankor@gmail.com> wrote:
> > > >>
> > > > >> > Is there a way to modify this duration ? This is not adhering to the
> > > > >> > "random" behavior that the documentation talks about.
> > > >> >
> > > >> >
> > > > >> > On Mon, Jun 9, 2014 at 4:41 PM, Kane Kane <kane.isturm@gmail.com> wrote:
> > > >> >
> > > > >> > > Last time I checked, the producer sticks to a partition for 10
> > > > >> > > minutes.
> > > >> > >
> > > >> > > On Mon, Jun 9, 2014 at 4:13 PM, Prakash Gowri Shankor
> > > >> > > <prakash.shankor@gmail.com> wrote:
> > > >> > > > Hi,
> > > >> > > >
> > > > >> > > > This is with 0.8.1.1 and I ran the command line console consumer.
> > > > >> > > > I have one broker, one producer and several consumers. I have one
> > > > >> > > > topic, many partitions m, many consumers n, m=n, one consumer group
> > > > >> > > > defined for all the consumers.
> > > > >> > > >
> > > > >> > > > From using Kafka Monitor, I see that each partition is assigned to one
> > > > >> > > > consumer now. However, it seems that there is no parallelism in data
> > > > >> > > > consumption. What I see happening is that one consumer gets messages
> > > > >> > > > from time t0 to t1 from partition P1. Then another consumer gets
> > > > >> > > > messages from t1 to t2 from partition P2 and so on.
> > > > >> > > >
> > > > >> > > > *Why is there no parallel consumption happening ?* It looks to me that
> > > > >> > > > the producer's data goes into P1 from t0 to t1 and then from t1 to t2
> > > > >> > > > into P2.
> > > > >> > > > I thought that if I don't specify a partitioning key, the producer's
> > > > >> > > > data will get partitioned randomly. It's just that the randomness
> > > > >> > > > seems to be "delayed". Why is this so ?
> > > > >> > > >
> > > > >> > > > I tried setting topic.metadata.refresh.interval.ms=100 in the
> > > > >> > > > producer.properties.
> > > > >> > > >
> > > > >> > > > That did not seem to change this strange partitioning behavior.
> > > >> > > >
> > > >> > > > Please help.
> > > >> > > >
> > > >> > > > Thanks
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -- Guozhang
> > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang
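
The behavior discussed in this thread (an unkeyed producer staying on one
partition until the next metadata refresh, then taking "a next random int %
num.partitions") can be modeled with a small toy simulation. This is only a
sketch for intuition, not Kafka's actual code: the class and function names
are hypothetical, time is passed in explicitly instead of being read from a
clock, and zlib.crc32 is an assumed stand-in for whatever hash a real
partitioner applies to the key.

```python
import random
import zlib


class StickyRandomPartitioner:
    """Toy model: keep one partition until the refresh interval elapses."""

    def __init__(self, num_partitions, refresh_interval_ms, seed=None):
        self.num_partitions = num_partitions
        self.refresh_interval_ms = refresh_interval_ms
        self._rng = random.Random(seed)
        self._current = None
        self._last_refresh_ms = None

    def partition(self, now_ms):
        """Return the partition for an unkeyed message sent at now_ms."""
        expired = (
            self._last_refresh_ms is None
            or now_ms - self._last_refresh_ms >= self.refresh_interval_ms
        )
        if expired:
            # Random int modulo the partition count. The new choice can be
            # the same partition as before, as noted in the thread.
            self._current = self._rng.randrange(2**31) % self.num_partitions
            self._last_refresh_ms = now_ms
        return self._current


def keyed_partition(key, num_partitions):
    """Toy keyed partitioning: stable hash of the key modulo partition count.

    crc32 is an assumption for illustration, not the hash Kafka uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # One message per second for a minute with a 10-minute refresh interval:
    # every message lands on the same partition.
    sticky = StickyRandomPartitioner(4, refresh_interval_ms=600_000, seed=1)
    print({sticky.partition(t * 1000) for t in range(60)})

    # Keyed messages spread out as long as the keys are evenly distributed.
    print({keyed_partition("user-%d" % i, 4) for i in range(100)})
```

With a long refresh interval the first set has exactly one element, which
matches the "delayed randomness" in the original report. Lowering the interval
makes the model re-pick more often, and keyed messages do not depend on the
refresh interval at all.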
