cassandra-commits mailing list archives

From "Aleksei Zotov (Jira)" <>
Subject [jira] [Commented] (CASSANDRA-12922) Bloom filter miss counts are not measured correctly
Date Mon, 12 Jul 2021 22:46:00 GMT


Aleksei Zotov commented on CASSANDRA-12922:


I know you've already confirmed that the patch looks good, but I want to confirm that one use
case is valid. I analyzed the code and see the following valid exit paths for {{getPosition}}:

||Use Case||Behavior||
|key is not present in BF|addTrueNegative and exit|
|key is present in Key Cache|addTruePositive and exit|
|key is not within sstable's keys range|addFalsePositive and exit|
|there is no index file|exit|
|key is not present in index file|*addFalsePositive (that's what we're fixing)* and exit|
|key is present in index file|addTruePositive and exit|
|else|addFalsePositive and exit|

The question is: don't we need to track a "false positive" if there is no index file? I know
that a missing index file is not something we expect, but from a BF perspective I see no
difference between the "key is not present in index file" and "there is no index file" use
cases. Please let me know your thoughts.
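
To make the question concrete, here is a minimal sketch of the exit paths from the table above. It is a simplified model, not the actual {{BigTableReader.getPosition}} code: the class {{GetPositionSketch}}, the {{Outcome}} enum, the boolean parameters, and the {{countMissingIndexAsFalsePositive}} flag are all hypothetical names introduced for illustration, and the checks simply follow the order of the table rows.

```java
// Simplified, hypothetical model of the exit paths listed in the table above.
// This is NOT the real BigTableReader.getPosition; every name here is illustrative.
class GetPositionSketch {

    enum Outcome { TRUE_NEGATIVE, TRUE_POSITIVE, FALSE_POSITIVE, UNTRACKED }

    static Outcome classify(boolean bfMayContain,
                            boolean inKeyCache,
                            boolean inKeyRange,
                            boolean hasIndexFile,
                            boolean inIndexFile,
                            boolean countMissingIndexAsFalsePositive) {
        if (!bfMayContain)
            return Outcome.TRUE_NEGATIVE;   // BF rejected the key
        if (inKeyCache)
            return Outcome.TRUE_POSITIVE;   // BF said "maybe" and the key cache has it
        if (!inKeyRange)
            return Outcome.FALSE_POSITIVE;  // BF said "maybe" but the key is out of range
        if (!hasIndexFile)
            // The open question: today this path exits without touching the counters.
            return countMissingIndexAsFalsePositive ? Outcome.FALSE_POSITIVE
                                                    : Outcome.UNTRACKED;
        if (inIndexFile)
            return Outcome.TRUE_POSITIVE;   // key found in the index file
        // Covers both "key is not present in index file" and the final "else" row.
        return Outcome.FALSE_POSITIVE;
    }

    public static void main(String[] args) {
        // BF passes, key is in range, but there is no index file:
        System.out.println(classify(true, false, true, false, false, false));
        System.out.println(classify(true, false, true, false, false, true));
        // BF passes, index file exists, key is not in it (the case the patch fixes):
        System.out.println(classify(true, false, true, true, false, false));
    }
}
```

With the flag off, the "no index file" lookup is the only path where the BF answered "maybe" and no counter is updated. With the flag on, it is counted the same way as "key is not present in index file".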

cc: [~blerer]



> Bloom filter miss counts are not measured correctly
> ---------------------------------------------------
>                 Key: CASSANDRA-12922
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Benjamin Lerer
>            Priority: Normal
>              Labels: lhf
>             Fix For: 4.x
>         Attachments: 12922-trunk.txt
> Bloom filter hits and misses are evaluated incorrectly in {{BigTableReader.getPosition}}:
we properly record hits, but not misses. In particular, if we don't find a match for a key
in the index, which is where almost all non-matches will be rejected, [we don't record a bloom
filter false positive|].
> This leads to very misleading output from e.g. {{nodetool tablestats}}.

This message was sent by Atlassian Jira
