cassandra-commits mailing list archives

From "Kirk True (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3974) Per-CF TTL
Date Wed, 11 Apr 2012 00:05:22 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251196#comment-13251196 ]

Kirk True commented on CASSANDRA-3974:
--------------------------------------

Jonathan, thanks for the feedback.

As a newbie hacking on the code, I need a bit of clarification...

bq. Looks like this only updates the CQL path? We'd want to make the Thrift path cf-ttl-aware
as well. I think this just means updating RowMutation + CF addColumn methods.

I actually thought the opposite. Part of the code I changed was in {{CFMetaData}}'s {{toThrift}}
and {{fromThrift}} methods. Perhaps I'm reading too much into the method names?

I did take a look at {{ColumnFamily}}'s {{addColumn}} method, but it already performs the
conditional based on the TTL value.

bq. Nit: we could simplify getTTL a bit by adding assert ttl > 0.

Sorry, I'm not sure to which part of the code you're referring :( Can you elaborate?

bq.    I got it backwards: we want max(cf ttl, column ttl) to be able to reason about the
live-ness of CF data w/o looking at individual rows

I cleaned up the {{CFMetaData.getTimeToLive}} method, which is now simply:

{noformat}
public int getTimeToLive(int timeToLive)
{
    return Math.max(defaultTimeToLive, timeToLive);
}
{noformat}
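
To illustrate what the max semantics mean for callers, here is a minimal standalone sketch. {{CfTtlSketch}} is a hypothetical stand-in for {{CFMetaData}}, not the class from the patch, and it assumes TTLs are in seconds with 0 meaning "no TTL":

```java
// Hypothetical stand-in for CFMetaData; only the TTL logic is modeled here.
final class CfTtlSketch {
    // CF-level default TTL. Assumption: seconds, with 0 meaning "no TTL".
    private final int defaultTimeToLive;

    CfTtlSketch(int defaultTimeToLive) {
        if (defaultTimeToLive < 0) {
            throw new IllegalArgumentException("TTL must not be negative");
        }
        this.defaultTimeToLive = defaultTimeToLive;
    }

    // Same body as the method above: the larger of the CF TTL and the column TTL wins.
    int getTimeToLive(int timeToLive) {
        return Math.max(defaultTimeToLive, timeToLive);
    }

    public static void main(String[] args) {
        CfTtlSketch cf = new CfTtlSketch(3600);
        System.out.println(cf.getTimeToLive(0));    // column has no TTL: CF TTL applies
        System.out.println(cf.getTimeToLive(60));   // shorter column TTL: CF TTL applies
        System.out.println(cf.getTimeToLive(7200)); // longer column TTL: column TTL applies
    }
}
```

With a CF TTL of 3600, a column written with no TTL or with a TTL of 60 gets 3600, and a column written with 7200 keeps 7200.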

bq.    We can break the compaction optimizations into another ticket. It really needs a separate
compaction Strategy; the idea is if we have an sstable A older than CF ttl, then all the data
in the file is dead and we can just delete the file without looking at it row-by-row. However,
there's a lot of tension there with the goal of normal compaction, which wants to merge different
versions of the same row, so we're going to churn a lot with a low chance of ever having an
sstable last the full TTL without being merged, effectively restarting our timer. So, I think
we're best served by an ArchivingCompactionStrategy that doesn't merge sstables at all, just
drops obsolete ones, and let people use that for append-only insert workloads. Which is a
common enough case that it's worth the trouble... probably.

Either way is fine. Would love to contribute.
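
To check my understanding of the "drop obsolete sstables" idea, here is a rough sketch of the selection step. Everything in it is hypothetical ({{ExpiredSSTableSketch}}, {{SSTableInfo}}, {{fullyExpired}}); it is not Cassandra's compaction API. It assumes each sstable can report the newest write time it contains and that no column in it outlives the CF TTL:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types exist in Cassandra.
final class ExpiredSSTableSketch {

    // Assumed per-sstable summary: a name and the newest write time it contains.
    static final class SSTableInfo {
        final String name;
        final long newestWriteSeconds;

        SSTableInfo(String name, long newestWriteSeconds) {
            this.name = name;
            this.newestWriteSeconds = newestWriteSeconds;
        }
    }

    // Returns the sstables whose newest write is already older than the CF TTL.
    // Under the stated assumption, every column in such a file has expired,
    // so the file could be deleted without reading it row by row.
    static List<SSTableInfo> fullyExpired(List<SSTableInfo> sstables,
                                          int cfTtlSeconds,
                                          long nowSeconds) {
        List<SSTableInfo> expired = new ArrayList<>();
        if (cfTtlSeconds <= 0) {
            return expired; // no CF TTL: nothing can be dropped wholesale
        }
        for (SSTableInfo sstable : sstables) {
            if (sstable.newestWriteSeconds + cfTtlSeconds <= nowSeconds) {
                expired.add(sstable);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        List<SSTableInfo> sstables = new ArrayList<>();
        sstables.add(new SSTableInfo("A", 1_000L)); // newest write at t=1000
        sstables.add(new SSTableInfo("B", 5_000L)); // newest write at t=5000
        for (SSTableInfo s : fullyExpired(sstables, 3_600, 6_000L)) {
            System.out.println("droppable: " + s.name);
        }
    }
}
```

Merging sstables would reset {{newestWriteSeconds}} for the merged file, which is the "restarting our timer" problem described above; a strategy that never merges avoids it.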
                
> Per-CF TTL
> ----------
>
>                 Key: CASSANDRA-3974
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Kirk True
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: trunk-3974.txt
>
>
> Per-CF TTL would allow compaction optimizations ("drop an entire sstable's worth of expired
> data") that we can't do with per-column.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
