cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "FAQ" by JonathanEllis
Date Mon, 14 Jun 2010 13:17:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "FAQ" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=72&rev2=73

--------------------------------------------------

  
   * The main limitation on column and super column size is that all the data for a single key and column must fit (on disk) on a single machine (node) in the cluster.  Because keys alone are used to determine the nodes responsible for replicating their data, the amount of data associated with a single key has this upper bound.  This is an inherent limitation of the distribution model.
  
+  * When large columns are created and retrieved, the column's data is loaded into RAM, which can get resource intensive quickly.  Consider loading 200 rows whose columns each store a 10 MB image file: that small result set would consume about 2 GB of RAM.  Clearly, as more and more large columns are loaded, RAM is consumed quickly.  This can be worked around, but it takes some upfront planning and testing to get a workable solution for most applications (a sketch of one common approach, chunking large values across several columns, follows this list).  You can find more information on this behavior at [[MemtableThresholds|memtables]], and a possible solution in 0.7 at [[https://issues.apache.org/jira/browse/CASSANDRA-16|CASSANDRA-16]].
  
   * Please refer to the notes in the Cassandra limitations section for more information:
[[CassandraLimitations|Cassandra Limitations]]
  
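For illustration only (not part of the wiki page), a minimal PHP sketch of the chunking workaround mentioned above; `$client->insert()` is a hypothetical stand-in for whatever Thrift client call your application actually uses:

{{{
<?php
// Sketch only: split a large value into fixed-size pieces so that no single
// column has to hold the whole blob in RAM at once.
// $client->insert() is a hypothetical placeholder, not a real client API.
function insert_chunked($client, $key, $column_family, $blob, $chunk_size = 1048576) {
        foreach (str_split($blob, $chunk_size) as $i => $chunk) {
                // Zero-padded column names (chunk_00000, chunk_00001, ...) keep the pieces ordered.
                $client->insert($key, $column_family, sprintf('chunk_%05d', $i), $chunk);
        }
}
}}}

Reading the value back then means fetching the chunk columns in order and concatenating them.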
@@ -258, +258 @@

  <<Anchor(a_long_is_exactly_8_bytes)>>
  
  == Insert operation throws InvalidRequestException with message "A long is exactly 8 bytes" ==
- 
  You are probably using the !LongType column sorter in your column family. !LongType assumes that the numbers stored in column names are exactly 64 bits (8 bytes) long and in big-endian format. Example PHP code for packing an integer for storage in Cassandra and unpacking it again:
  
  {{{
+         /**
+          * Takes a PHP integer and packs it into a 64-bit (8 byte) big-endian binary representation.
+          * @param  $x integer
+          * @return string eight-byte binary representation of the integer in big-endian order.
+          */
+         public static function pack_longtype($x) {
+                 return pack('C8', ($x >> 56) & 0xff, ($x >> 48) & 0xff, ($x >> 40) & 0xff, ($x >> 32) & 0xff,
+                                 ($x >> 24) & 0xff, ($x >> 16) & 0xff, ($x >> 8) & 0xff, $x & 0xff);
+         }
  
+         /**
+          * Takes an eight-byte big-endian binary representation of an integer and unpacks it into a PHP integer.
+          * @param  $x string
+          * @return integer
+          */
+         public static function unpack_longtype($x) {
+                 $a = unpack('C8', $x);
+                 return ($a[1] << 56) + ($a[2] << 48) + ($a[3] << 40) + ($a[4] << 32) + ($a[5] << 24) + ($a[6] << 16) + ($a[7] << 8) + $a[8];
+         }
  }}}
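A quick round-trip example (illustrative only; `CassandraUtil` is just a placeholder name for a class containing the two helpers above):

{{{
<?php
// Pack the integer 42, inspect the bytes, and unpack it again.
$packed = CassandraUtil::pack_longtype(42);
echo bin2hex($packed), "\n";                          // prints 000000000000002a
echo CassandraUtil::unpack_longtype($packed), "\n";   // prints 42
}}}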
- 
  <<Anchor(clustername_mismatch)>>
  
  == Cassandra says "ClusterName mismatch: oldClusterName != newClusterName" and refuses to start ==
- 
  To prevent operator errors, Cassandra stores the name of the cluster in its system table.  If you need to rename a cluster for some reason, it is safe to remove system/LocationInfo* after forcing a compaction on all ColumnFamilies (while the old cluster name is still in place), provided that you have specified the node's token in the config file, or that you don't care about preserving the node's token (for instance, in single-node clusters).
  
  <<Anchor(batch_mutate_atomic)>>
+ 
  == Are batch_mutate operations atomic? ==
- 
+ No.  [[API#batch_mutate|batch_mutate]] is a way to group many operations into a single call in order to save on the cost of network round-trips.  If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied.  The client should typically retry the mutation.
  
