cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction
Date Thu, 17 Feb 2011 17:53:24 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995947#comment-12995947
] 

T Jake Luciani commented on CASSANDRA-1902:
-------------------------------------------

bq. Can you give an overview of what the patch does?

At a high level, this is to lessen the effect of sstable compaction and cleanup on reads. It
does this by letting compaction figure out which rows in an SSTable are in the OS page cache
and making sure that the compacted rows remain in the page cache. Remember that post CASSANDRA-1470
we are not putting anything in the page cache during compaction.

bq. Where and how you "track contiguous pages per key," why that is the right solution

This is the right solution because the worst case is the same as the current code today. It
can really only help because it's just giving the OS hints; it's up to the OS to do with that
info what it thinks is best.

The important piece is in CLibrary.getCachedPages(File file, int minContiguousPages)

This takes a file and mmaps it in 2G chunks, then uses the posix mincore() call to get a vector
of which pages in the range are actually cached (for a totally unread file this is []). We
use the starting offset + (page size * index of each cached page) to return a vector of positions
on disk. We use minContiguousPages to filter out the noise of cache fragments.
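
To make the offset arithmetic and the fragment filter concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class and method names are hypothetical, the mincore() result is modelled as a boolean[] with one flag per page, and it assumes one position is returned for every page in a run that is long enough.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the patch: turns a per-page residency
// vector (what mincore() would report) into file positions.
final class CachedPagePositions {
    /**
     * @param resident           one flag per page in the mapped chunk (assumed input shape)
     * @param startOffset        file offset at which the mapped chunk begins
     * @param pageSize           OS page size in bytes
     * @param minContiguousPages runs of cached pages shorter than this are dropped
     * @return file positions of the cached pages that survive the filter
     */
    static long[] cachedPositions(boolean[] resident, long startOffset,
                                  int pageSize, int minContiguousPages) {
        List<Long> positions = new ArrayList<>();
        int runStart = -1; // index of the first page in the current cached run, or -1
        // Iterate one step past the end so a run touching the last page is closed too.
        for (int i = 0; i <= resident.length; i++) {
            boolean cached = i < resident.length && resident[i];
            if (cached) {
                if (runStart < 0) runStart = i;
            } else if (runStart >= 0) {
                if (i - runStart >= minContiguousPages) {
                    for (int p = runStart; p < i; p++) {
                        // starting offset + (page size * page index)
                        positions.add(startOffset + (long) pageSize * p);
                    }
                }
                runStart = -1;
            }
        }
        long[] out = new long[positions.size()];
        for (int i = 0; i < out.length; i++) out[i] = positions.get(i);
        return out;
    }
}
```

A larger minContiguousPages throws away short, isolated runs, so only ranges that look like real read activity are reported.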


Jump to SSTableScanner: here we use the file positions from getCachedPages to figure out if
a given row is considered "active". If it is, we set the isInPageCache flag on the SSTableIdentityIterator.
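
One way the "active" check could work is sketched below. This is an illustration only, not the SSTableScanner code: the names are hypothetical, and it assumes the cached positions are sorted, page-aligned, and that a row occupies the byte range [rowStart, rowEnd) in the data file.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the patch.
final class ActiveRowCheck {
    /**
     * Returns true if any cached page overlaps the row's byte range.
     *
     * @param sortedPagePositions page-aligned file positions of cached pages, ascending
     * @param pageSize            OS page size in bytes
     * @param rowStart            first byte of the row (inclusive)
     * @param rowEnd              end of the row (exclusive)
     */
    static boolean isRowInPageCache(long[] sortedPagePositions, int pageSize,
                                    long rowStart, long rowEnd) {
        // Start of the page that contains the first byte of the row.
        long firstPage = (rowStart / pageSize) * pageSize;
        int idx = Arrays.binarySearch(sortedPagePositions, firstPage);
        if (idx < 0) {
            // Not found: convert to the insertion point, i.e. the first
            // cached page that starts after firstPage.
            idx = -idx - 1;
        }
        // That page overlaps the row only if it starts before the row ends.
        return idx < sortedPagePositions.length && sortedPagePositions[idx] < rowEnd;
    }
}
```

The binary search keeps the per-row cost logarithmic in the number of cached pages, which matters when a scanner walks every row of a large SSTable.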


Jump to CompactionManager: if any part of a row has been flagged as active, then we make sure
that when we write the new SSTable this row's data is not forced out of the page cache (the default
action from CASSANDRA-1470).

The two variables we probably should expose here are: 

1. BRAF.MAX_BYTES_IN_PAGE_CACHE - this says how many bytes I should let the page cache buffer
before I force a flush of the OS cache for this file's working set (this is currently set to 128MB,
which, based on my testing, is a nice default)

2. SSTableScanner's call to getCachedPages uses a minContiguousPages setting of 32. Again,
this is a nice default I've found.


By increasing (1) you pollute your page cache more but slightly increase your write performance.
By increasing (2) you migrate fewer and fewer rows during compaction.
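
The trade-off in (1) can be pictured with a small byte budget. The sketch below is a guess at the shape of the logic, not the BRAF implementation: the class and method names are hypothetical, and it assumes that writes for rows flagged as active are simply exempt from the forced flush.

```java
// Hypothetical model of a "max bytes in page cache" budget, not the BRAF code.
final class PageCacheBudget {
    private final long maxBytesInPageCache;
    private long bytesSinceLastFlush;

    PageCacheBudget(long maxBytesInPageCache) {
        this.maxBytesInPageCache = maxBytesInPageCache;
    }

    /**
     * Records a write and reports whether the caller should now ask the OS
     * to drop the buffered range from the page cache.
     *
     * @param bytes           number of bytes just written
     * @param keepInPageCache true for rows flagged as active (assumed exempt)
     */
    boolean recordWrite(long bytes, boolean keepInPageCache) {
        if (keepInPageCache) {
            return false; // hot data stays cached and does not count against the budget
        }
        bytesSinceLastFlush += bytes;
        if (bytesSinceLastFlush >= maxBytesInPageCache) {
            bytesSinceLastFlush = 0; // budget spent: flush and start counting again
            return true;
        }
        return false;
    }
}
```

With a bigger budget the flush fires less often, so writes are cheaper but more cold compaction output sits in the page cache between flushes.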





> Migrate cached pages during compaction 
> ---------------------------------------
>
>                 Key: CASSANDRA-1902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>             Fix For: 0.7.2
>
>         Attachments: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>          Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted
> CF during the compaction process.
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that uses the posix
> mincore() function to detect the offsets of pages for this file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the keys actually
> in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the os cache and
> make sure the subsequent pages in the new compacted SSTable are kept in the page cache for
> these keys. This will minimize the impact of compacting a "hot" SSTable.
> A simpler yet similar approach is described here: http://insights.oetiker.ch/linux/fadvise/

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
