kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: What does RowSet Compaction Duration means?
Date Wed, 15 Mar 2017 01:29:23 GMT
Hi Jason,

I can't tell from your screenshot which entity it's on, but if you're
looking at an aggregate across a table, then it's certainly possible that 5
different tablet replicas are being compacted at a time.

Another way to interpret the "milliseconds per second" is a percentage --
500%. In other words, 5 cores in your cluster are working on compacting
parts of that table.
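
As a rough illustration of that arithmetic (a sketch only, not Kudu or Cloudera Manager code; the function names are hypothetical, and the counter-delta step assumes the sampling scheme Alexey describes in the quoted message below):

```python
def rate_ms_per_s(prev_ms, curr_ms, interval_s):
    """Rate of a cumulative duration counter, in milliseconds per second.

    prev_ms and curr_ms are two readings of the same counter taken
    interval_s seconds apart (assumed sampling scheme).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (curr_ms - prev_ms) / interval_s


def busy_cores(ms_per_s):
    """1000 ms of work per wall-clock second keeps one core fully busy."""
    return ms_per_s / 1000.0


def as_percent(ms_per_s):
    """The same rate expressed as a percentage of a single core."""
    return ms_per_s / 10.0


reported = 4956.71  # the value from the original question, in ms/s
print(f"{busy_cores(reported):.2f} cores busy")
print(f"{as_percent(reported):.0f}% of one core")
```

Read this way, 4956.71 ms/s is a little under 5 cores' worth of compaction work summed over whatever set of replicas the chart aggregates.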

If you are heavily bulk loading one table and another one is compacting,
it's probably because the compacting one received inserts fairly
recently. In general, Kudu tries to compact the table most in need of
compaction.

One issue that you might be hitting is that Kudu can currently prioritize
compaction over flushes in such a way that memory fills up and throughput
suffers. I'm working on a bunch of notes about this phenomenon right now,
in fact --
https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit#
has a lot of the gory details. You can expect this to be improved in future
releases.

-Todd

On Tue, Mar 14, 2017 at 6:10 PM, Jason Heo <jason.heo.sde@gmail.com> wrote:

> Hi Alexey.
>
> Thank you for your reply.
>
> With your help, now I can understand what `compact_rs_duration` means. But
> the `default_num_replicas` is just 3, not 5 :(
>
> It seems compaction on tableB heavily affects bulk loading on tableA. Is
> there a way to minimize compaction activity? (something like changing
> the configuration of Kudu)
>
> The FAQ says that "Since compactions are so predictable, the only tuning
> knob available is the number of threads dedicated to flushes and
> compactions in the *maintenance manager*."
>
> My `maintenance_manager_num_threads` is already 1.
>
> Thanks.
>
> 2017-03-15 3:48 GMT+09:00 Alexey Serbin <aserbin@cloudera.com>:
>
>> Hi Jason,
>>
>> As I understand, that 'milliseconds / second' cryptic unit means 'number
>> of units / for sampling (or averaging) interval'.
>>
>> I.e., they capture that metric reading (expressed in milliseconds) every
>> second, subtract the previous value from the current value, and report the
>> result as the measurement at the current time.  If the metric is not
>> captured every second, they measure every X seconds, subtract the
>> previous measurement from the current one, and then divide by X.
>>
>> For a single tablet, the 'compact_rs_duration' metric stands for 'Time
>> spent compacting RowSets'.  As I understand, that
>> 'total_kudu_compact_rs_duration_sum_rate_across_kudu_replicas' is the
>> sum of those measurements for all existing replicas of the
>> specified tablet across the Kudu cluster.
>>
>> I suspect you have the replication factor of 5 for that tablet, and at
>> some point all replicas become busy with rowset compaction all the time.
>>
>> Compactions on tables are run in the background.  Compactions on
>> different tables run independently.  So, if you have some other activity
>> doing inserts/updates on tableB, then it's natural to see compaction happen
>> on tableB as well.
>>
>>
>> Best regards,
>>
>> Alexey
>>
>> On Tue, Mar 14, 2017 at 12:50 AM, Jason Heo <jason.heo.sde@gmail.com>
>> wrote:
>>
>>> Hi.
>>>
>>> I'm stuck with performance degradation when compaction happens.
>>>
>>> My Duration is "4956.71 milliseconds / second". What does this mean? I
>>> can't figure it out.
>>>
>>> Here is the captured image: http://imgur.com/WU9sRRq
>>>
>>> When I'm doing bulk indexing on tableA, sometimes compaction happens
>>> on tableB. Is this situation natural?
>>>
>>> Thanks.
>>>
>>
>>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
