samza-dev mailing list archives

From "Bae, Jae Hyeon" <metac...@gmail.com>
Subject Re: KafkaCheckPointManager is too slow
Date Wed, 04 Nov 2015 08:18:19 GMT
Also, I am using Samza 0.9.1.

On Wed, Nov 4, 2015 at 12:18 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:

> Hi Yi
>
> There are 8 partitions in the input topic. The size of checkpoint topic is
> 26 MB.
>
> segment.bytes=26214400, cleanup.policy=compact
>
> On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan <nickpan47@gmail.com> wrote:
>
>> Hi, Jae,
>>
>> That's correct. I mentioned that just to confirm that the checkpoint
>> topic should be log-compacted. Then, the next question is: what's the
>> size of the data in the checkpoint topic, and how many input topic
>> partitions do you have in the job?
>>
>> It would also be helpful if you could share which version of Samza you
>> are using.
>>
>> Thanks!
>>
>> -Yi
>>
>> On Tue, Nov 3, 2015 at 11:03 PM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
>>
>> > Hi Yi
>> >
>> > My colleague found that Samza automatically enables log compaction
>> > when it creates the checkpoint topic.
>> >
>> > Topic:__samza_checkpoint_ver_1_for_xxx_1 PartitionCount:1
>> > ReplicationFactor:3 Configs:segment.bytes=26214400,cleanup.policy=compact
>> >
>> > Topic: __samza_checkpoint_ver_1_for_xxx_1 Partition: 0 Leader: 66
>> > Replicas: 66,24,65 Isr: 24,65,66
>> >
>> > So, the problem is not log-compaction.
>> >
>> >
>> >
>> > On Tue, Nov 3, 2015 at 8:33 PM, Yi Pan <nickpan47@gmail.com> wrote:
>> >
>> > > Hi, Bae,
>> > >
>> > > Where did you see this log? Is it in JobRunner? Or AppMaster? Or
>> > > SamzaContainer?
>> > >
>> > > There are a few factors that may have an impact:
>> > > 1. How many system stream partitions do you have as input? And how
>> > > many tasks are there?
>> > > 2. Did you set your checkpoint topic as a log-compacted topic in
>> > > Kafka? The topic size would be much smaller if log compaction is
>> > > turned on.
>> > >
>> > > Regards
>> > >
>> > > -Yi
>> > >
>> > > On Tue, Nov 3, 2015 at 3:59 PM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
>> > >
>> > > > Hi Samza Dev
>> > > >
>> > > > Do you know why the following job is taking too long?
>> > > >
>> > > > 2015-11-03 23:58:17 KafkaCheckpointManager [INFO] Get latest offset
>> > > > 3386930 for topic __samza_checkpoint_ver_1_for_xxx_1 and partition 0.
>> > > >
>> > > > This is seriously slowing down development. How can I fix this
>> > > > problem?
>> > > >
>> > > > Thank you
>> > > > Best, Jae
>> > > >
>> > >
>> >
>>
>
>
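
Editor's note: Yi's point that the topic "would be much smaller if log compaction is turned on" can be illustrated with a small simulation. This is only a sketch, not Samza or Kafka code. It assumes that recovering the last checkpoint means scanning every record still retained in the checkpoint topic, with the last value seen per key winning. The key names (task-0 .. task-7), the 8 tasks, and the 1000 checkpoints per task are hypothetical values chosen for illustration.

```python
def compact(log):
    """Simulate log compaction: keep only the newest record for each key.

    `log` is a list of (key, value) pairs in write order. Surviving
    records keep their original offsets, as in a compacted Kafka log.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # a newer record replaces the older one
    return sorted((off, key, val) for key, (off, val) in latest.items())


def read_last_checkpoints(records):
    """Scan all retained records; return the last value per key and the scan count."""
    checkpoints = {}
    scanned = 0
    for _offset, key, value in records:
        checkpoints[key] = value
        scanned += 1
    return checkpoints, scanned


# Hypothetical workload: 8 tasks, each writing 1000 checkpoints in turn.
log = [(f"task-{t}", i) for i in range(1000) for t in range(8)]
uncompacted = [(off, key, val) for off, (key, val) in enumerate(log)]

before, scanned_before = read_last_checkpoints(uncompacted)
after, scanned_after = read_last_checkpoints(compact(log))

print(f"uncompacted: scanned {scanned_before} records")
print(f"compacted:   scanned {scanned_after} records")
print(f"same result: {before == after}")
```

Under these assumptions the recovered checkpoints are identical either way, but the reader of the compacted log touches one record per task instead of every checkpoint ever written.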
