cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1421) An eventually consistent approach to counting
Date Mon, 23 Aug 2010 15:34:18 GMT


Jonathan Ellis commented on CASSANDRA-1421:

Stu points out that sharding the rows is not inherent to the design and could be an optional
part 2.

> An eventually consistent approach to counting
> ---------------------------------------------
>                 Key: CASSANDRA-1421
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.7.0
> Counters may be implemented as multiple rows in a column family; that is, counters will
have a configurable shard parameter; a shard factor of 128 would have 128 rows.
> An increment will be a (uuid, count) name, value tuple.  The row shard will be uuid %
shardfactor.  Timestamp is ignored.  This could be implemented w/ the existing Thrift write
api, or we could add a special case method for it.  Either is fine; the main advantage of
the former is it lets increments be included in batch mutations.
> (Decrements we get for free as simply negative values.)
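A minimal sketch of the write path described above, assuming an in-memory dict stands in for the column family. The names `SHARD_FACTOR`, `rows`, `shard_for` and `write_increment` are hypothetical and are not part of Cassandra or the Thrift API. It shows the shard being chosen as uuid % shardfactor, a retried write landing on the same column (so it is not counted twice), and a decrement written as a negative value.

```python
import uuid
from collections import defaultdict

SHARD_FACTOR = 128  # the configurable shard parameter from the proposal

# shard index -> {increment uuid -> count}; a stand-in for rows in a column family
rows = defaultdict(dict)

def shard_for(increment_id: uuid.UUID, shard_factor: int = SHARD_FACTOR) -> int:
    """Pick the row shard as uuid % shardfactor."""
    return increment_id.int % shard_factor

def write_increment(increment_id: uuid.UUID, count: int) -> None:
    """Store the increment as a (uuid, count) column; no read is needed."""
    rows[shard_for(increment_id)][increment_id] = count

first = uuid.UUID(int=130)   # fixed ids keep the example deterministic
second = uuid.UUID(int=5)

write_increment(first, 3)
write_increment(first, 3)    # a retry overwrites the same column
write_increment(second, -1)  # a decrement is just a negative value

# A naive read sums every column in every shard.
total = sum(sum(columns.values()) for columns in rows.values())
```

In real use the ids would come from something like `uuid.uuid4()`; fixed ids are used here only so the shard placement is easy to follow.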
> Each node will be responsible for aggregating *the rows replicated to it* after GCGraceSeconds
have elapsed.  Count aggregation will be a scheduled task on each machine.  This will require
a mutex for each shard vs both writes and reads.
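A rough sketch of the local aggregation step, under stated assumptions: the `Shard` class, its field names, and the small `GC_GRACE_SECONDS` value are hypothetical, and the sketch records a local arrival time per increment to decide when GCGraceSeconds has elapsed, which the proposal does not specify. It illustrates one lock per shard guarding writes, reads and the scheduled aggregation pass.

```python
import threading

GC_GRACE_SECONDS = 10  # illustrative value only

class Shard:
    def __init__(self):
        self.lock = threading.Lock()  # the per-shard mutex
        self.aggregate = 0            # sum of increments already folded in
        self.pending = {}             # increment id -> (count, arrival time)

    def write(self, increment_id, count, now):
        with self.lock:
            self.pending[increment_id] = (count, now)

    def aggregate_expired(self, now):
        """Fold increments older than GC_GRACE_SECONDS into the aggregate."""
        with self.lock:
            expired = [key for key, (_, arrived) in self.pending.items()
                       if now - arrived >= GC_GRACE_SECONDS]
            for key in expired:
                count, _ = self.pending.pop(key)
                self.aggregate += count
            return len(expired)

    def read(self):
        """Combine the aggregate with the not-yet-aggregated increments."""
        with self.lock:
            return self.aggregate + sum(c for c, _ in self.pending.values())

shard = Shard()
shard.write("a", 5, now=0)
shard.write("b", -2, now=8)
before = shard.read()
folded = shard.aggregate_expired(now=10)  # only "a" is old enough
after = shard.read()
```

The value read from the shard is the same before and after the pass; aggregation only changes how many columns a read has to sum.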
> This will not have the conflict resolution problem of CASSANDRA-580, or the write fragility
of CASSANDRA-1072.  Normal CL will apply on both read and write.  Write idempotency is preserved.
 I expect writes will be faster than either, since no reads are required at all on the write
path.  Reads will be slower, but the read overhead can be reduced by lowering GCGraceSeconds
to below your repair frequency if you are okay with the durability tradeoff there (it will
not be worse than CASSANDRA-1072, for instance).  More disk space will be used by this approach,
but that is the cheapest resource we have.
> Special case code required will be much less than either the 580 or 1072 approach --
primarily some code in StorageProxy to combine the uuid slices with their aggregation columns
and sum them for all the shards, the local aggregation code, and minor changes to read/write
path to add the mutex vs aggregation.
> We could also get rid of the Clock change and go back to i64 timestamps; if we're not
going to use Clocks for increments I don't think they have much raison d'être.  (Those of
you just joining us, see for background.)  The CASSANDRA-1072
approach doesn't use Clocks either, or rather, it uses Clocks but not a byte[] value, which
really means the Clock is unnecessary.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
