cassandra-commits mailing list archives

From "B. Todd Burruss (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-1956) Convert row cache to row+filter cache
Date Mon, 13 Feb 2012 23:59:00 GMT


B. Todd Burruss commented on CASSANDRA-1956:

I have very wide rows (140k columns) that I query randomly, usually about 100 columns at
a time. Wide rows do not work well with the SerializingCacheProvider because of the constant
copying of data. ConcurrentLinkedHashMap performs very well, but eats memory because of all
the ByteBuffers.

I'm trying to understand whether this will help my case. Head and tail caching will help folks
with time series data, but not me. Possibly the "handful of named columns" caching will help,
but my queries overlap, so the same columns will exist in multiple cache entries, ballooning
the cache.

What I was hoping for was a scheme to split the wide row into smaller "segments" so that not
as much data is copied in the SerializingCacheProvider.
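
To make the idea concrete, here is a minimal sketch of what I mean by segments. It is a toy
model, not Cassandra code: the class SegmentedRowCache, its methods, and the use of pickle as
a stand-in for the serialized off-heap copy are all assumptions for illustration. Each cache
entry holds one serialized slice of the row's sorted columns, and a read deserializes only the
slices that cover the requested column names.

```python
import pickle
from bisect import bisect_right


class SegmentedRowCache:
    """Toy model (hypothetical): one cache entry per serialized row segment."""

    def __init__(self, segment_size):
        self.segment_size = segment_size
        self.boundaries = {}   # row_key -> first column name of each segment
        self.segments = {}     # (row_key, segment_index) -> serialized bytes
        self.deserialized = 0  # number of segments deserialized so far

    def put_row(self, row_key, columns):
        """Split the row's columns, in name order, into fixed-size segments."""
        names = sorted(columns)
        firsts = []
        for idx, start in enumerate(range(0, len(names), self.segment_size)):
            chunk = names[start:start + self.segment_size]
            firsts.append(chunk[0])
            # pickle stands in for the serialized (copied) representation.
            self.segments[(row_key, idx)] = pickle.dumps(
                {name: columns[name] for name in chunk})
        self.boundaries[row_key] = firsts

    def get_columns(self, row_key, wanted):
        """Return the requested columns, touching only the segments needed."""
        firsts = self.boundaries.get(row_key)
        if firsts is None:
            return None  # row is not cached
        # Group the requested names by the segment that would contain them.
        needed = {}
        for name in wanted:
            idx = bisect_right(firsts, name) - 1
            if idx >= 0:
                needed.setdefault(idx, []).append(name)
        result = {}
        for idx, names in needed.items():
            segment = pickle.loads(self.segments[(row_key, idx)])
            self.deserialized += 1
            for name in names:
                if name in segment:
                    result[name] = segment[name]
        return result
```

With randomly chosen columns the benefit depends on segment size: the smaller the segments,
the less unrelated data each read has to copy, at the cost of more cache entries per row.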

> Convert row cache to row+filter cache
> -------------------------------------
>                 Key: CASSANDRA-1956
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>         Attachments: 0001-1956-cache-updates-v0.patch, 0001-commiting-block-cache.patch,
0001-re-factor-row-cache.patch, 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch,
> Changing the row cache to a row+filter cache would make it much more useful. We currently
have to warn against using the row cache with wide rows, where the read pattern is typically
a peek at the head, but this use case would be perfectly supported by a cache that stored only
columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is likely to have
some gotchas for weird usage patterns, and it requires the list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a secondary
index to look up cache entries by rowkey so that you can keep them in sync with the memtable
> * others?
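
The "rowkey+filterid" option above can be sketched roughly as follows. This is an
illustration only, not the attached patches: the class RowFilterCache and its methods are
hypothetical, and it simply drops every cached filter result for a row on write (invalidation)
rather than updating entries in place to stay in sync with the memtable.

```python
from collections import defaultdict


class RowFilterCache:
    """Hypothetical cache keyed by (row_key, filter_id) with a per-row index."""

    def __init__(self):
        self.entries = {}               # (row_key, filter_id) -> cached columns
        self.by_row = defaultdict(set)  # secondary index: row_key -> filter_ids

    def put(self, row_key, filter_id, columns):
        self.entries[(row_key, filter_id)] = dict(columns)
        self.by_row[row_key].add(filter_id)

    def get(self, row_key, filter_id):
        # Returns None on a cache miss.
        return self.entries.get((row_key, filter_id))

    def invalidate_row(self, row_key):
        """On a write to row_key, drop every cached filter result for that row."""
        for filter_id in self.by_row.pop(row_key, set()):
            self.entries.pop((row_key, filter_id), None)

    def cached_column_count(self):
        # Counts columns across all entries, including duplicates from overlap.
        return sum(len(cols) for cols in self.entries.values())


cache = RowFilterCache()
cache.put("row1", "head:2", {"a": 1, "b": 2})
cache.put("row1", "names:b,c", {"b": 2, "c": 3})
print(cache.cached_column_count())  # column "b" is stored twice
cache.invalidate_row("row1")
print(cache.get("row1", "head:2"))
```

The secondary index is what lets a write find all entries for a row without scanning the
whole cache. The sketch also shows the overlap concern raised in the comment: two filters that
select the same column each keep their own copy of it.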
