cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting
Date Mon, 01 Dec 2014 19:38:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230304#comment-14230304 ]

Björn Hegerfors commented on CASSANDRA-8371:
--------------------------------------------

[~jbellis] How about adding max_sstable_age_seconds and preferring it if both options are set
(or giving an error if both are set), without deprecating max_sstable_age_days?
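
To make the two alternatives concrete, here is a minimal sketch in Python of how such an
option lookup could behave. It is not Cassandra's code: the function name, the strict flag
and the default_days parameter (including its value) are hypothetical and only illustrate
"prefer seconds" versus "error if both are set".

```python
from typing import Mapping

SECONDS_PER_DAY = 86_400


def resolve_max_sstable_age_seconds(
    options: Mapping[str, str],
    strict: bool = False,
    default_days: float = 365.0,  # assumed default, for illustration only
) -> float:
    """Return the effective max sstable age in seconds.

    strict=False: max_sstable_age_seconds wins when both options are set.
    strict=True:  setting both options is rejected with an error.
    """
    seconds = options.get("max_sstable_age_seconds")
    days = options.get("max_sstable_age_days")

    if strict and seconds is not None and days is not None:
        raise ValueError(
            "set only one of max_sstable_age_seconds and max_sstable_age_days"
        )
    if seconds is not None:
        return float(seconds)
    if days is not None:
        return float(days) * SECONDS_PER_DAY
    # Neither option given: fall back to the (assumed) default in days.
    return default_days * SECONDS_PER_DAY
```

Either way, existing tables that only set max_sstable_age_days keep working unchanged,
which is the point of not deprecating it.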

[~michaelsembwever] I didn't answer before, but since you don't write more than 50 MB per
hour, I don't think that base_time_seconds is the problem. I don't really have any ideas about
what could cause this increased IO. I suppose logging would help. DTCS logs exactly the same
things as STCS, but maybe some additional timestamp information would be useful to see as
well.

In CASSANDRA-6602 I attached TimestampViewer.java which takes all the *data.db files in a
data folder and outputs some relevant timestamp metadata (overlaps, for example). I find it
useful to look at its output sometimes on our DTCS clusters. I've also generated some images
from its output, which illustrate very well what DTCS sees. When I get time, I could clean
it up to make it work more generally, if anyone is interested. It's written in Haskell, using
the Diagrams library.
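
For anyone who wants a quick approximation of the overlap part, here is a rough sketch in
Python. It is not TimestampViewer.java and does not read sstable files: it assumes you
already have a (min timestamp, max timestamp) pair per sstable, and the names SSTableSpan
and overlapping_pairs are made up for this example.

```python
from itertools import combinations
from typing import Iterable, List, NamedTuple, Tuple


class SSTableSpan(NamedTuple):
    """Timestamp range covered by one sstable (hypothetical container)."""
    name: str
    min_timestamp: int
    max_timestamp: int


def overlapping_pairs(spans: Iterable[SSTableSpan]) -> List[Tuple[str, str]]:
    """Return the names of every pair of sstables whose timestamp ranges intersect."""
    pairs = []
    for a, b in combinations(list(spans), 2):
        # Two closed intervals intersect when each one starts before the other ends.
        if a.min_timestamp <= b.max_timestamp and b.min_timestamp <= a.max_timestamp:
            pairs.append((a.name, b.name))
    return pairs


# Made-up timestamps, purely to show the call.
example = [
    SSTableSpan("a-Data.db", 0, 10),
    SSTableSpan("b-Data.db", 5, 15),
    SSTableSpan("c-Data.db", 20, 30),
]
example_overlaps = overlapping_pairs(example)
```

The pairwise check is quadratic in the number of sstables, which should be fine for a
single data folder but is not meant for anything larger.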

> DateTieredCompactionStrategy is always compacting 
> --------------------------------------------------
>
>                 Key: CASSANDRA-8371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: mck
>            Assignee: Björn Hegerfors
>              Labels: compaction, performance
>         Attachments: java_gc_counts_rate-month.png, read-latency-recommenders-adview.png,
read-latency.png, sstables-recommenders-adviews.png, sstables.png, vg2_iad-month.png
>
>
> Running 2.0.11 and having switched a table to [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602]
we've seen that disk IO and gc count increase, along with the number of reads happening in
the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always happening, and
pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in slides 7+8 of
https://prezi.com/b9-aj6p2esft/
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and screenshots
of our munin graphs as we have gone from STCS, to LCS (week ~44), to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under (3) in the
description of CASSANDRA-6602?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
