cassandra-commits mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-975) explore upgrading CLHM
Date Fri, 11 Jun 2010 20:10:29 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877948#action_12877948
] 

Ben Manes commented on CASSANDRA-975:
-------------------------------------

I've been thinking over the test and I don't think that the performance numbers should be
given much weight. It is an excellent stress test, though, showing that the interleaving
of operations does not cause a system failure.

As a performance test, it primarily indicates that there is no added bottleneck. Writing a
robust benchmark is extremely difficult, especially on the JVM. For now I've been punting
by using JBoss' benchmark, but I haven't evaluated its correctness to determine how valid
it is. At some point I should write my own, but that's a non-trivial undertaking. JBoss' test
only shows the per-operation overhead, which is nearly equivalent to that of a ConcurrentHashMap
(which I decorate). It does not take into account the system-level cost of a cache miss, so
a higher hit rate with slower per-operation execution may result in much better system performance.
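
To make the trade-off above concrete, here is a minimal sketch of the expected cost
of one cache lookup. The function name and all numbers are hypothetical, chosen only
for illustration; they are not measurements from this issue.

```python
def effective_access_time(hit_rate, per_op_overhead, miss_penalty):
    """Expected cost of one cache lookup, in the same time unit as the inputs."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Every lookup pays the cache's own overhead; only misses pay the I/O penalty.
    return per_op_overhead + (1.0 - hit_rate) * miss_penalty


# Hypothetical figures, in microseconds: a 5 ms disk read on a miss.
# Policy A is cheap per operation; policy B costs 5x more per operation
# but has a hit rate that is 5 points higher.
fast_low_hit = effective_access_time(0.80, 0.2, 5000.0)
slow_high_hit = effective_access_time(0.85, 1.0, 5000.0)

print(f"A: {fast_low_hit:.1f} us/op, B: {slow_high_hit:.1f} us/op")
```

Under these assumed numbers the slower-per-operation policy wins overall, because the
miss penalty dominates the per-operation overhead by several orders of magnitude.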

Example concerns with this test from a benchmark perspective include:
(1) Test environment does not reflect production (laptop)
(2) JVM parameters and OS are not tuned (e.g. GC algorithm)
(3) Short-running test does not show if there is degradation/failures over time
(4) Working set does not reflect production usage (random - should use trace data)
(5) Hit-rate has a dramatic impact (miss penalty = I/O), so the test may artificially
benefit an eviction policy.

It would be nice to see how the hit-rate compares between the two implementations. I suspect that
in Matthew's test the SecondChance is a tad better, so the fewer I/O calls to a slow laptop
disk can account for much of the difference. If the LIRS were stable, it would probably show
much faster system performance due to having a 5-10% higher hit-rate.
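
One way to get at the hit-rate question is to replay the same key trace through each
cache and count hits and misses. The sketch below is only an illustration: it uses a
plain LRU policy as a stand-in (not SecondChance or LIRS), and the class and function
names are hypothetical rather than anything from CLHM or the attached test scripts.

```python
from collections import OrderedDict


class CountingLruCache:
    """Bounded LRU cache that records hits and misses (stand-in policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self.entries:
            # Hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Miss: this is where the real system would pay for disk I/O.
        self.misses += 1
        value = load(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def replay(trace, cache):
    """Feed every key in the trace through the cache and return its hit rate."""
    for key in trace:
        cache.get(key, load=lambda k: k)
    return cache.hit_rate()


# Toy trace; a meaningful comparison would replay recorded production keys.
trace = ["a", "b", "a", "c", "b", "d", "a"]
for capacity in (2, 3):
    print(capacity, replay(trace, CountingLruCache(capacity)))
```

Running each eviction policy over an identical trace keeps the working set fixed, so
any difference in hit rate comes from the policy rather than from random key selection.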

> explore upgrading CLHM
> ----------------------
>
>                 Key: CASSANDRA-975
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-975
>             Project: Cassandra
>          Issue Type: Task
>            Reporter: Jonathan Ellis
>            Assignee: Matthew F. Dennis
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 0001-trunk-975.patch, clhm_test_results.txt, insertarator.py, readarator.py
>
>
> The new version should be substantially better "on large caches where many entries were
read," which is exactly what you see in our row and key caches.
> http://code.google.com/p/concurrentlinkedhashmap/issues/detail?id=9
> Hopefully we can get Digg to help test, since they could reliably break CLHM when it
was buggy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

