cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14605) Major compaction of LCS tables very slow
Date Fri, 27 Jul 2018 12:07:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559654#comment-16559654
] 

Benedict commented on CASSANDRA-14605:
--------------------------------------

The issue is probably that, with an LCS major compaction, we do a great deal of unnecessary work.
By definition most of the sstables will not intersect with the recently modified position
of the latest sstable, and by default LCS has very small sstables - so there are a great
many that we will be looping over unnecessarily.
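
To illustrate the cost described above, here is a minimal sketch, not the actual SSTableRewriter
code and not necessarily the fix that will be made. All names in it (MoveStartsSketch, Range,
intersectingNaive, intersectingSorted) are hypothetical. It assumes the sstables of a level are
sorted by first key and do not overlap, as is the case for LCS levels above L0, and it models
keys as plain longs. The naive version examines every sstable each time the position advances;
the sorted version binary-searches for the first candidate and stops at the first non-match.

```java
import java.util.ArrayList;
import java.util.List;

final class MoveStartsSketch {
    /** Hypothetical stand-in for an sstable covering the inclusive key range [first, last]. */
    static final class Range {
        final long first, last;
        Range(long first, long last) { this.first = first; this.last = last; }
    }

    /** Naive scan: checks every range against the window (from, to]. */
    static List<Range> intersectingNaive(List<Range> level, long from, long to) {
        List<Range> out = new ArrayList<>();
        for (Range r : level)
            if (r.last > from && r.first <= to)
                out.add(r);
        return out;
    }

    /**
     * Assumes 'level' is sorted by first key and non-overlapping, so 'last' is
     * also increasing. Binary-search for the first range ending after 'from',
     * then walk forward only while ranges still start at or before 'to'.
     */
    static List<Range> intersectingSorted(List<Range> level, long from, long to) {
        int lo = 0, hi = level.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (level.get(mid).last > from) hi = mid;
            else lo = mid + 1;
        }
        List<Range> out = new ArrayList<>();
        for (int i = lo; i < level.size() && level.get(i).first <= to; i++)
            out.add(level.get(i));
        return out;
    }
}
```

With 1000 ranges of width 10 and a window of (4995, 5012], both methods return the same three
ranges, but the naive one inspects all 1000 entries on every call while the sorted one inspects
roughly ten for the search plus the three matches.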

I think we can improve the status quo quite straightforwardly, but I think we should probably
revisit the whole approach of managing the key cache here once we have done so, as this code
has been around since time immemorial, and may not translate to our current architecture
so well.

 

> Major compaction of LCS tables very slow
> ----------------------------------------
>
>                 Key: CASSANDRA-14605
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14605
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>         Environment: AWS, i3.4xlarge instance (very fast local nvme storage), Linux 4.13
> Cassandra 3.0.16
>            Reporter: Joseph Lynch
>            Assignee: Benedict
>            Priority: Minor
>              Labels: lcs, performance
>         Attachments: slow_major_compaction_lcs.svg
>
>
> We've recently started deploying 3.0.16 more heavily in production and today I noticed
that full compaction of LCS tables takes a much longer time than it should. In particular
it appears to be faster to convert a large dataset to STCS, run full compaction, and then
convert it to LCS (with re-leveling) than it is to just run full compaction on LCS (with re-leveling).
> I was able to get a CPU flame graph showing 50% of the major compaction's cpu time being
spent in [{{SSTableRewriter::maybeReopenEarly}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L184]
calling [{{SSTableRewriter::moveStarts}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L223].
> I've attached the flame graph here, which was generated by running Cassandra with {{-XX:+PreserveFramePointer}},
then using jstack to get the compaction thread's native thread id (nid), which I then passed to perf to
record on-CPU time:
> {noformat}
> perf record -t <compaction thread> -o <output file> -F 49 -g sleep 60 >/dev/null
> {noformat}
> I took this data and collapsed it using the steps talked about in [Brendan Gregg's java
in flames blogpost|https://medium.com/netflix-techblog/java-in-flames-e763b3d32166] (Instructions
section) to generate the graph.
> The results are that at least on this dataset (700GB of data compressed, 2.2TB uncompressed),
we are spending 50% of our cpu time in {{moveStarts}} and I am unsure that we need to be doing
that as frequently as we are. I'll see if I can come up with a clean reproduction to confirm
if it's a general problem or just on this particular dataset.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

