cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache
Date Sat, 16 Apr 2016 20:49:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244390#comment-15244390
] 

Benedict commented on CASSANDRA-11452:
--------------------------------------

bq.  though a rough calculation indicates it isn't a huge savings

Assuming a basic CLHM, with CompressedOops on a 64-bit VM (Cassandra's defaults) I calculate
overhead inflation of around 22% - I reckon 72 bytes are needed vs a possible 56 (once the
12 byte overheads are removed, and alignment accounted for).  You'd also be able to avoid
recalculating the hash for the sketches since it's memoized in CHM.  Admittedly I don't 100%
vouch for the accuracy of those calculations as I'm doing it from memory.
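
As a quick check of the arithmetic, taking the 72 and 56 byte figures as given (they are quoted from memory, and the per-field breakdown is not stated here), the 22% figure matches the saving measured against the larger footprint.  The class and method names below are illustrative only:

```java
// Illustrative helper, not part of Cassandra or CLHM.
final class OverheadEstimate {
    private OverheadEstimate() {}

    /** Share of the current per-entry footprint that would be saved. */
    static double savedFraction(long currentBytes, long possibleBytes) {
        return (currentBytes - possibleBytes) / (double) currentBytes;
    }

    /** How much larger the current footprint is than the possible one. */
    static double inflation(long currentBytes, long possibleBytes) {
        return (currentBytes - possibleBytes) / (double) possibleBytes;
    }
}
```

With these inputs, savedFraction(72, 56) is 16/72, roughly 0.22, while inflation(72, 56) is 16/56, roughly 0.29.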

I absolutely am not suggesting your calculation of cost/benefit is wrong though, or that I
would even have arrived at a different conclusion.  Certainly the user key/value sizes further
amortize that overhead inflation, and for many workloads the distinction is barely perceptible.

bq. What do you think about combining the approach

I assume you mean the inversion of that guard.  It's a shame we don't have access to the CHM
to do the sampling, as that would make it robust to scans since all the members of the LRU
would have high frequencies.  My only slight concern is that we may have to wait for tens of thousands
of rejections to cycle out the collision, which is quite slow to respond.  By raising the chance
we harm scan resistance though.  A couple of other options:

# Randomly sample the frequency of, say, 1% of the items we admit (on admission, storing the
last 16 or so), on demand compute the low quartile
# On demand, sample a random short run of the sketch when we encounter this situation, compute
some percentile (need some thought for which)

Then either for 1% of admissions, or when your current guard is triggered, compute this statistic
for the guard.  For absolute security, for say 0.01% of candidates, admit without any check.
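
A minimal sketch of the first option, to make the bookkeeping concrete.  Everything here is an assumption for illustration: the class name, the method names, and the convention of returning -1 when no samples exist are hypothetical, and the rates are constructor parameters rather than fixed at the 1% and 0.01% mentioned above.  How the resulting statistic feeds the guard is deliberately left to the caller, since that is still open:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper; not an existing Cassandra or Caffeine class.
final class SampledAdmissionStats {
    private final int[] samples;      // ring buffer of recently sampled frequencies
    private long recorded;            // total number of samples ever stored
    private final double sampleRate;  // fraction of admissions to sample, e.g. 0.01
    private final double bypassRate;  // fraction of candidates admitted unchecked, e.g. 0.0001
    private final Random random;

    SampledAdmissionStats(int capacity, double sampleRate, double bypassRate, Random random) {
        this.samples = new int[capacity]; // "the last 16 or so"
        this.sampleRate = sampleRate;
        this.bypassRate = bypassRate;
        this.random = random;
    }

    /** Call on every admission; stores the frequency for a random subset of them. */
    void onAdmit(int frequency) {
        if (random.nextDouble() < sampleRate) {
            samples[(int) (recorded % samples.length)] = frequency; // overwrite the oldest slot
            recorded++;
        }
    }

    /** Low quartile of the retained samples, computed on demand; -1 if none yet. */
    int lowQuartile() {
        int n = (int) Math.min(recorded, samples.length);
        if (n == 0) {
            return -1;
        }
        int[] sorted = Arrays.copyOf(samples, n);
        Arrays.sort(sorted);
        return sorted[n / 4];
    }

    /** True for the small fraction of candidates that should skip every check. */
    boolean shouldBypass() {
        return random.nextDouble() < bypassRate;
    }
}
```

Sorting a copy of 16 ints on demand keeps the admission path itself to a single random draw and an occasional array store, which seems in keeping with computing the statistic only when the guard is triggered.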

That all said, I expect for Cassandra's purposes many of the proposed solutions so far will
be sufficient, and I certainly wouldn't have any problem with the solution you propose.

> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-11452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid having to
explicitly mark compaction accesses as non-cacheable, we need a cache implementation that
uses an eviction algorithm that can better handle non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
