cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted
Date Wed, 27 Jun 2018 11:14:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524923#comment-16524923
] 

Marcus Eriksson commented on CASSANDRA-14423:
---------------------------------------------

the patches lgtm

> SSTables stop being compacted
> -----------------------------
>
>                 Key: CASSANDRA-14423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Blocker
>             Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and not being
included in compactions/as candidates for compaction. It seems to get progressively worse
until there's only 1-2 SSTables in the view which happen to be the most recent SSTables and
thus compactions completely stop for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into the view,
but this is only a temporary fix. User defined/major compactions still work - not clear if
they include the result back in the view but is not a good work around.
> This also results in a discrepancy between SSTable count and SSTables in levels for any
table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that SSTable count will be different to the number of SSTables
reported in the Compaction Bucket's. In the below example there's only 3 SSTables in a single
bucket - no more are listed for this table. Compaction thresholds haven't been modified for
this table and it's a very basic KV schema.
> {code:java}
> Keyspace : yyy
>     Read Count: 30485
>     Read Latency: 0.06708991307200263 ms.
>     Write Count: 57044
>     Write Latency: 0.02204061776873992 ms.
>     Pending Flushes: 0
>         Table: yyy
>         SSTable count: 19
>         Space used (live): 18195482
>         Space used (total): 18195482
>         Space used by snapshots (total): 0
>         Off heap memory used (total): 747376
>         SSTable Compression Ratio: 0.7607394576769735
>         Number of keys (estimate): 116074
>         Memtable cell count: 0
>         Memtable data size: 0
>         Memtable off heap memory used: 0
>         Memtable switch count: 39
>         Local read count: 30485
>         Local read latency: NaN ms
>         Local write count: 57044
>         Local write latency: NaN ms
>         Pending flushes: 0
>         Percent repaired: 79.76
>         Bloom filter false positives: 0
>         Bloom filter false ratio: 0.00000
>         Bloom filter space used: 690912
>         Bloom filter off heap memory used: 690760
>         Index summary off heap memory used: 54736
>         Compression metadata off heap memory used: 1880
>         Compacted partition minimum bytes: 73
>         Compacted partition maximum bytes: 124
>         Compacted partition mean bytes: 96
>         Average live cells per slice (last five minutes): NaN
>         Maximum live cells per slice (last five minutes): 0
>         Average tombstones per slice (last five minutes): NaN
>         Maximum tombstones per slice (last five minutes): 0
>         Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy Compaction
buckets are [[BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67168-big-Data.db'),
BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67167-big-Data.db'),
BigTableReader(path='/var/lib/cassandra/data/yyy/yyy-5f7a2d60e4a811e6868a8fd39a64fd59/mc-67166-big-Data.db')]]
> {code}
> Also for every LCS table we're seeing the following warning being spammed (seems to be
in line with anticompaction spam):
> {code:java}
> Apr 26 21:30:09 cassandra[9263]: WARN  o.a.c.d.c.LeveledCompactionStrategy Live sstable
/var/lib/cassandra/data/xxx/xxx-8c3ef9e0e3fc11e6868a8fd39a64fd59/mc-79024-big-Data.db from
level 0 is not on corresponding level in the leveled manifest. This is not a problem per se,
but may indicate an orphaned sstable due to a failed compaction not cleaned up properly.{code}
> This is a vnodes cluster with 256 tokens per node, and the only thing that seems like
it could be causing issues is anticompactions.
> CASSANDRA-14079 might be related but doesn't quite describe the same issue, and in this
case we're using only a single disk for data. Have yet to reproduce but figured worth reporting
here first.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message