cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "LargeDataSetConsiderations" by PeterSchuller
Date Wed, 13 Apr 2011 03:21:37 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "LargeDataSetConsiderations" page has been changed by PeterSchuller.
The comment on this change is: Reflect that CASSANDRA-2191 may be addressing compaction concurrency
for 0.8.
http://wiki.apache.org/cassandra/LargeDataSetConsiderations?action=diff&rev1=17&rev2=18

--------------------------------------------------

   * Prior to 0.7.1 (fixed in [[https://issues.apache.org/jira/browse/CASSANDRA-1555|CASSANDRA-1555]]),
if you had column families with more than 143 million row keys in them, bloom filter false
positive rates would be likely to go up because of implementation concerns that limited the
maximum size of a bloom filter. See [[ArchitectureInternals]] for information on how bloom
filters are used. The negative effect of hitting this limit is that reads will start taking
additional seeks to disk as the row count increases. Note that the effect you are seeing at
any given moment will depend on when compaction was last run, because the bloom filter limit
is per-sstable. It is an issue for column families because after a major compaction, the entire
column family will be in a single sstable.
   * Compaction is currently not concurrent, so only a single compaction runs at a time. This
means that sstable counts may spike during larger compactions as several smaller sstables
are written while a large compaction is happening. This can cause additional seeks on reads.
    * Potential future improvements: [[https://issues.apache.org/jira/browse/CASSANDRA-1876|CASSANDRA-1876]]
and [[https://issues.apache.org/jira/browse/CASSANDRA-1881|CASSANDRA-1881]]
+   * Potentially already fixed for 0.8 (todo: go through ticket history and make sure what
it implies): [[https://issues.apache.org/jira/browse/CASSANDRA-2191|CASSANDRA-2191]]
   * Consider the choice of file system. Removal of large files is notoriously slow and seek
bound on e.g. ext2/ext3. Consider xfs or ext4fs. This affects background unlink():ing of sstables
that happens every now and then, and also affects start-up time (if there are sstables pending
removal when a node is starting up, they are removed as part of the start-up process; it
may thus be detrimental if removing a terabyte of sstables takes an hour (numbers are ballparks,
not accurately measured, and depend on circumstances)).
   * Adding nodes is a slow process if each node is responsible for a large amount of data.
Plan for this; do not try to throw additional hardware at a cluster at the last minute.
   * Cassandra will read through sstable index files on start-up, doing what is known as "index
sampling". This is used to keep a subset (currently and by default, 1 out of 100) of keys
and their on-disk location in the index, in memory. See [[ArchitectureInternals]]. This
means that the larger the index files are, the longer it takes to perform this sampling. Thus,
for very large indexes (typically when you have a very large number of keys) the index sampling
on start-up may be a significant issue.
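
To illustrate the index sampling described in the last item, here is a minimal Python sketch. It is not Cassandra's implementation: the names `build_sample` and `lookup`, the in-memory list standing in for an on-disk index file, and the use of plain string keys sorted lexicographically are all assumptions made for illustration. It shows why start-up cost grows with index size (every entry is read once) while only 1 in 100 entries is kept in memory.

```python
import bisect

SAMPLE_INTERVAL = 100  # "1 out of 100", as stated in the text above


def build_sample(index_entries, interval=SAMPLE_INTERVAL):
    """Scan every (key, data_offset) entry once, keeping every interval-th key
    together with its position in the index. The full scan is why sampling
    time grows with the size of the index."""
    sample = []
    for position, (key, _data_offset) in enumerate(index_entries):
        if position % interval == 0:
            sample.append((key, position))
    return sample


def lookup(index_entries, sample, key, interval=SAMPLE_INTERVAL):
    """Find the nearest sampled key <= key, then scan at most `interval`
    index entries from that position. Returns the data offset or None."""
    sampled_keys = [k for k, _ in sample]
    i = bisect.bisect_right(sampled_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first entry in the index
    start = sample[i][1]
    for k, data_offset in index_entries[start:start + interval]:
        if k == key:
            return data_offset
    return None


# Hypothetical index: 1000 sorted keys, each pointing at a made-up data offset.
index = [(f"key{i:06d}", i * 10) for i in range(1000)]
sample = build_sample(index)
print(len(sample), lookup(index, sample, "key000250"))
```

In this sketch the sample holds 10 entries for 1000 keys, and a lookup touches at most 100 index entries after the binary search over the sample.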
