samza-dev mailing list archives

From Chinmay Soman <chinmay.cere...@gmail.com>
Subject Re: KafkaCheckPointManager is too slow
Date Wed, 04 Nov 2015 15:53:24 GMT
Hey Jae, we've turned off log compaction because of issues seen earlier
with the log compaction thread on Kafka. We're still on 0.8.2, so we can't
really turn it on.

Our best bet is to set retention to something like 3 hours.
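
For illustration only, here is a rough sketch of what a 3-hour retention could look like as a topic-level override. It only builds the command line and does not run it. The helper name and the ZooKeeper address are placeholders, and it assumes the stock `kafka-topics.sh` tool with its `--alter` and `--config` options. Time-based retention applies to topics whose `cleanup.policy` is `delete`, so the sketch switches the policy as well; whether that is acceptable for a checkpoint topic is a judgment call for the job owner.

```python
# Hypothetical helper: builds (does not execute) a kafka-topics.sh invocation
# that sets a time-based retention override on a single topic.
def retention_override_command(topic, hours, zookeeper="localhost:2181"):
    # Kafka expresses retention.ms in milliseconds.
    retention_ms = int(hours * 60 * 60 * 1000)
    return [
        "bin/kafka-topics.sh",
        "--zookeeper", zookeeper,  # placeholder address, adjust for your cluster
        "--alter",
        "--topic", topic,
        # Assumption: retention is only enforced under the delete policy.
        "--config", "cleanup.policy=delete",
        "--config", f"retention.ms={retention_ms}",
    ]


if __name__ == "__main__":
    cmd = retention_override_command("__samza_checkpoint_ver_1_for_xxx_1", 3)
    print(" ".join(cmd))
```

Three hours works out to 10800000 ms. Old segments are only removed once they have been rolled, so with `segment.bytes=26214400` the topic can still hold up to one active segment beyond the retention window.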
On Nov 4, 2015 12:18 AM, "Bae, Jae Hyeon" <metacret@gmail.com> wrote:

> Also, I am using Samza 0.9.1.
>
> On Wed, Nov 4, 2015 at 12:18 AM, Bae, Jae Hyeon <metacret@gmail.com>
> wrote:
>
> > Hi Yi
> >
> > There are 8 partitions in the input topic. The size of checkpoint topic is
> > 26 MB.
> >
> > segment.bytes=26214400, cleanup.policy=compact
> >
> > On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan <nickpan47@gmail.com> wrote:
> >
> >> Hi, Jae,
> >>
> >> That's correct. I mentioned that just to confirm that checkpoint topic
> >> should be log-compacted. Then, the next question is: what's the size of
> >> the data in the checkpoint topic and how many input topic partitions you
> >> have in the job?
> >>
> >> It would also be helpful if you can share which version of samza you are
> >> using.
> >>
> >> Thanks!
> >>
> >> -Yi
> >>
> >> On Tue, Nov 3, 2015 at 11:03 PM, Bae, Jae Hyeon <metacret@gmail.com>
> >> wrote:
> >>
> >> > Hi Yi
> >> >
> >> > My colleague found that samza automatically set log compaction when
> >> > creating the checkpointing topic.
> >> >
> >> > Topic:__samza_checkpoint_ver_1_for_xxx_1 PartitionCount:1 ReplicationFactor:3 Configs:segment.bytes=26214400,cleanup.policy=compact
> >> >
> >> > Topic: __samza_checkpoint_ver_1_for_xxx_1 Partition: 0 Leader: 66 Replicas: 66,24,65 Isr: 24,65,66
> >> >
> >> > So, the problem is not log-compaction.
> >> >
> >> >
> >> >
> >> > On Tue, Nov 3, 2015 at 8:33 PM, Yi Pan <nickpan47@gmail.com> wrote:
> >> >
> >> > > Hi, Bae,
> >> > >
> >> > > Where did you see this log? Is it in JobRunner? Or AppMaster? Or
> >> > > SamzaContainer?
> >> > >
> >> > > There are a few factors that may have the impact:
> >> > > 1. How many system stream partitions you have as the input? And how
> >> > > many tasks are there?
> >> > > 2. Did you set your checkpoint topic as log-compact topic in Kafka?
> >> > > The topic size would be much smaller if log compaction is turned on.
> >> > >
> >> > > Regards
> >> > >
> >> > > -Yi
> >> > >
> >> > > On Tue, Nov 3, 2015 at 3:59 PM, Bae, Jae Hyeon <metacret@gmail.com>
> >> > wrote:
> >> > >
> >> > > > Hi Samza Dev
> >> > > >
> >> > > > Do you know why the following job is taking too long?
> >> > > >
> >> > > > 2015-11-03 23:58:17 KafkaCheckpointManager [INFO] Get latest offset 3386930 for topic __samza_checkpoint_ver_1_for_xxx_1 and partition 0.
> >> > > >
> >> > > > This is seriously slowing down development. How can I fix this
> >> > > > problem?
> >> > > >
> >> > > > Thank you
> >> > > > Best, Jae
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
