samza-dev mailing list archives

From: "Bae, Jae Hyeon" <metac...@gmail.com>
Subject: Re: KafkaCheckPointManager is too slow
Date: Wed, 04 Nov 2015 08:18:06 GMT

Hi Yi

There are 8 partitions in the input topic. The size of the checkpoint topic
is 26 MB.

segment.bytes=26214400
cleanup.policy=compact
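
For anyone who wants to check these settings from a script, here is a minimal sketch in Python. It is not part of Samza or Kafka: `parse_topic_configs` is a hypothetical helper, and the input string is the Configs value from the topic description quoted further down in this thread.

```python
def parse_topic_configs(configs):
    """Parse a 'key=value,key=value' Configs string into a dict.

    Hypothetical helper for illustration only. A comma-separated value
    (for example a policy list) is kept attached to the key before it.
    """
    result = {}
    last_key = None
    for item in configs.split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            last_key = key.strip()
            result[last_key] = value.strip()
        elif last_key is not None and item.strip():
            # No '=' here, so treat it as a continuation of the previous value.
            result[last_key] += "," + item.strip()
    return result


configs = parse_topic_configs("segment.bytes=26214400,cleanup.policy=compact")

# The topic counts as compacted if 'compact' is among its cleanup policies.
is_compacted = "compact" in configs.get("cleanup.policy", "").split(",")

# Convert the segment size from bytes to MiB.
segment_mib = int(configs["segment.bytes"]) / (1024 * 1024)

print(is_compacted, segment_mib)
```

With these values, segment.bytes works out to exactly 25 MiB, so the 26 MB checkpoint topic is roughly the size of one segment.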

On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan <nickpan47@gmail.com> wrote:

> Hi, Jae,
>
> That's correct. I mentioned that just to confirm that the checkpoint topic
> should be log-compacted. Then, the next question is: what's the size of the
> data in the checkpoint topic, and how many input topic partitions do you
> have in the job?
>
> It would also be helpful if you can share which version of Samza you are
> using.
>
> Thanks!
>
> -Yi
>
> On Tue, Nov 3, 2015 at 11:03 PM, Bae, Jae Hyeon <metacret@gmail.com>
> wrote:
>
> > Hi Yi
> >
> > My colleague found that Samza automatically sets log compaction when
> > creating the checkpoint topic.
> >
> > Topic:__samza_checkpoint_ver_1_for_xxx_1 PartitionCount:1 ReplicationFactor:3 Configs:segment.bytes=26214400,cleanup.policy=compact
> >
> > Topic: __samza_checkpoint_ver_1_for_xxx_1 Partition: 0 Leader: 66 Replicas: 66,24,65 Isr: 24,65,66
> >
> > So, the problem is not log-compaction.
> >
> >
> >
> > On Tue, Nov 3, 2015 at 8:33 PM, Yi Pan <nickpan47@gmail.com> wrote:
> >
> > > Hi, Bae,
> > >
> > > Where did you see this log? Is it in JobRunner? Or AppMaster? Or
> > > SamzaContainer?
> > >
> > > There are a few factors that may have an impact:
> > > 1. How many system stream partitions do you have as input? And how
> > > many tasks are there?
> > > 2. Did you set your checkpoint topic as a log-compacted topic in Kafka?
> > > The topic size would be much smaller if log compaction is turned on.
> > >
> > > Regards
> > >
> > > -Yi
> > >
> > > On Tue, Nov 3, 2015 at 3:59 PM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
> > >
> > > > Hi Samza Dev
> > > >
> > > > Do you know why the following job is taking too long?
> > > >
> > > > 2015-11-03 23:58:17 KafkaCheckpointManager [INFO] Get latest offset 3386930 for topic __samza_checkpoint_ver_1_for_xxx_1 and partition 0.
> > > >
> > > > This is seriously slowing down development. How can I fix this
> > > > problem?
> > > >
> > > > Thank you
> > > > Best, Jae
> > > >
> > >
> >
>
