cassandra-commits mailing list archives

From "Jonathan Shook (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting
Date Sun, 07 Dec 2014 06:03:13 GMT


Jonathan Shook commented on CASSANDRA-8371:

I tend to agree with Tupshin on the first point, which is to say that an occasional side-effect
of a needed repair should be small compared to the over-arching benefit of having a (tunably)
lower steady-state compaction load. My rationale, in detail, is below. If I have made a mistake
somewhere in this explanation, please correct me. 

It is true that the boundary increases geometrically, but not necessarily true that this means
compaction load will be lower as the windows get larger. There is a distinction for the most
recent intervals simply because that is where memtable flushing and DTCS meet up, with the
expected variance in sstable sizes. I'll assume the implications of this are obvious and only
focus for now on later compactions.

If we had ideal scheduling of later compactions, each sstable would be coalesced exactly once
per interval size. This isn't what we will expect to see as a rule, but we have to start somewhere
for a reasonable estimate on the bounds. This means that our average compaction load would
tend towards a constant over time for each window, and that the average compaction load for
all active interval sizes would stack linearly depending on how many windows were accounted
for. This means that the compaction load is super-linear over time in the case of no max age.
Even though the stacking effect does slow down over time, it's merely a slowing of increased
load, not the base load itself.
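
The idealized estimate above can be sketched as a toy model. This is not DTCS code: the function names, the one-hour base interval, the growth factor of 4, and the unit ingestion rate are all assumptions chosen for illustration, and real scheduling will not be this tidy.

```python
def active_interval_sizes(elapsed_seconds, base_seconds=3600, growth=4):
    """Count interval sizes base * growth**k that fit within elapsed_seconds.

    base_seconds and growth are assumed values, not taken from the thread.
    """
    count = 0
    size = base_seconds
    while size <= elapsed_seconds:
        count += 1
        size *= growth
    return count


def ideal_compaction_load(elapsed_seconds, ingest_rate=1.0,
                          base_seconds=3600, growth=4):
    """Idealized load with no max age.

    Assumes each sstable is coalesced exactly once per interval size, so
    every active interval size contributes one ingest_rate worth of rewrites.
    """
    return ingest_rate * active_interval_sizes(
        elapsed_seconds, base_seconds, growth)


if __name__ == "__main__":
    for days in (1, 7, 30, 365):
        print(days, ideal_compaction_load(days * 86400))
```

Under these assumptions the count of active interval sizes is 3 after one day and 7 after a year: each new step takes longer to arrive, but without a max age nothing ever stops the next one from arriving.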

In contrast, given a max age and an average ingestion rate, the average steady-state compaction
load increases as each larger interval becomes active, but levels out at a maximum. If the
max age is low enough, then the effect can be significant. Considering that the load stacking
effect occurs more quickly in recent time but less quickly as time progresses, the adjustment
of max age closer to now() has the most visible effect. In other words, a max adjustment which
deactivates compaction at the 4th smallest interval size will have a less obvious effect than
one that deactivates the 3rd or 2nd.
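
A minimal sketch of the max-age effect, under the same kind of simplifying assumptions (hypothetical function names, an assumed one-hour base interval and growth factor of 4, one unit of load per active interval size). It only illustrates the shape of the argument and is not how DTCS computes anything.

```python
def steady_state_load(max_age_seconds, base_seconds=3600, growth=4,
                      ingest_rate=1.0):
    """Plateau of the idealized load when a max age is set.

    Only interval sizes no larger than max_age_seconds are ever compacted;
    each one is assumed to cost one ingest_rate worth of rewrites.
    """
    tiers = 0
    size = base_seconds
    while size <= max_age_seconds:
        tiers += 1
        size *= growth
    return ingest_rate * tiers


def days_until_active(tier, base_seconds=3600, growth=4):
    """Data age, in days, at which the tier-th smallest interval size
    (1-based) would start being compacted if no max age stopped it."""
    return base_seconds * growth ** (tier - 1) / 86400


if __name__ == "__main__":
    for tier in (2, 3, 4):
        # A max age just below this tier's size keeps the tier inactive.
        cutoff = base = 3600 * 4 ** (tier - 1) - 1
        print(tier, days_until_active(tier), steady_state_load(cutoff))
```

In this model a cutoff just below the 2nd interval size leaves one active size, while a cutoff just below the 4th leaves three, and the smaller sizes are the ones that become active soonest after data is written.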

Reducing the steady-state compaction load has significant advantages across the board in a
well-balanced system. Testing can easily show the correlation between higher average compaction
load and lower op rates and worsening latency spikes.

Requiring that the max be higher than the time it takes for a scheduled repair cycle would
rule out these types of adjustments. As well, the boundary between those two settings is pretty
fuzzy, considering that most automated repair schedules take a week or more.

There are also remedies, if you see that repairs are significantly affecting your larger intervals.
If you want it to be perfectly compacted (probably not that important, in all honesty),
simply adjust the max age, let DTCS recompact the higher intervals, and then adjust it back,
or not. If I were having a significant amount of data being repaired on a routine basis, I'd
probably be scaling or tuning the system at that point, anyway. Repairs that have to stream
enough data to really become a problem for larger intervals should be considered a bad thing:
a sign that there are other pressures in the system that need to be addressed. However, a
limited amount of data being repaired, as in a healthy cluster, can be handled quite well
by IntervalTree, BloomFilter and friends.

I'm not advocating specifically for a low default max, but I did want to explain the rationale
for not ruling it out as a valid choice in certain cases.

> DateTieredCompactionStrategy is always compacting 
> --------------------------------------------------
>                 Key: CASSANDRA-8371
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: mck
>            Assignee: Björn Hegerfors
>              Labels: compaction, performance
>         Attachments: java_gc_counts_rate-month.png, read-latency-recommenders-adview.png,
>                      read-latency.png, sstables-recommenders-adviews.png, sstables.png, vg2_iad-month.png
> Running 2.0.11 and having switched a table to DTCS
> we've seen that disk IO and gc count increase, along with the number of reads happening in
> the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always happening, and
> pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in slides 7+8 of
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and screenshots
> of our munin graphs as we have gone from STCS, to LCS (week ~44), to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under (3) in the
> description of CASSANDRA-6602?

This message was sent by Atlassian JIRA
