commons-issues mailing list archives

From "Gilles Sadowski (Jira)" <j...@apache.org>
Subject [jira] [Commented] (COLLECTIONS-728) BloomFilter contribution
Date Sat, 19 Oct 2019 16:18:00 GMT

    [ https://issues.apache.org/jira/browse/COLLECTIONS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955205#comment-16955205
] 

Gilles Sadowski commented on COLLECTIONS-728:
---------------------------------------------

{quote}IntStream stream(); // a stream of integers that are the indexes of the enabled bits.
{quote}
As already noted, I lack the practical experience to judge whether this is required for
_all_ use-cases of the Bloom filter functionality. However, from a design POV, this is a kludge
(IMHO) because, as I've tried to convey on the ML, it conflates the concept ("determines,
with some probability of false positives, whether something exists") with how it is implemented
(a specific, and potentially inefficient, representation of the underlying structure that
provides one way to obtain the requested result).
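
To make the distinction concrete, here is a minimal sketch of an API that exposes only the
concept and keeps the representation private. All names ({{MembershipFilter}},
{{BitSetMembershipFilter}}, {{mightContain}}) are hypothetical and are not taken from the attached
files; the hashing scheme is a placeholder chosen only so that the example is self-contained.

```java
import java.util.BitSet;

// Hypothetical concept-level API: callers only ask membership questions.
interface MembershipFilter<T> {
    void add(T item);

    // May return true for an item that was never added (false positive),
    // but never returns false for an item that was added.
    boolean mightContain(T item);
}

// One possible implementation; the BitSet and the index scheme stay private.
final class BitSetMembershipFilter<T> implements MembershipFilter<T> {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BitSetMembershipFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Placeholder double hashing derived from hashCode(); an assumption
    // made for illustration, not a recommended hash scheme.
    private int index(T item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    @Override
    public void add(T item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    @Override
    public boolean mightContain(T item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, a different representation could replace the {{BitSet}} without touching callers,
which would not be possible if the interface itself returned the indexes of the enabled bits.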
{quote}I have added BloomFilterI2
{quote}
One particular example strikes me as a clear illustration of what I see as the confusion
between "BloomFilter" and "set of bits":
{quote}
{code:java}
    default int hammingValue() {
        return cardinality();
    }

    default int hammingDistance( BloomFilterI2 other ) {
        return xorCardinality( other );
    }
{code}
{quote}
IIUC, {{hammingDistance}} is a core property of Bloom filters, whereas {{xorCardinality}}
is how it can be computed, given some assumption (i.e. "BitSetI") about the underlying representation.
The latter is, in OO terminology, an implementation detail that should not percolate to the
public API. Moreover, in this case, the methods {{hammingValue}} and {{hammingDistance}} are
clearly redundant with the methods they delegate to, and one would readily ask: Why don't we
just drop them?
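
As a rough sketch of the alternative (my reading only, with hypothetical names that do not appear
in the attachments), the public interface could declare just the Hamming operations, while the
XOR-and-count step lives inside one concrete class. The {{ofBits}} factory exists only to build
test instances for this example.

```java
import java.util.BitSet;

// Hypothetical public API: only the Bloom filter concepts are visible.
interface HammingFilter<F> {
    int hammingValue();

    int hammingDistance(F other);
}

// One possible implementation; xor and cardinality never appear in the API.
final class BitSetHammingFilter implements HammingFilter<BitSetHammingFilter> {
    private final BitSet bits = new BitSet();

    private BitSetHammingFilter() {
    }

    // Demo-only factory that enables the given bit indexes.
    static BitSetHammingFilter ofBits(int... indexes) {
        BitSetHammingFilter filter = new BitSetHammingFilter();
        for (int i : indexes) {
            filter.bits.set(i);
        }
        return filter;
    }

    @Override
    public int hammingValue() {
        return bits.cardinality();
    }

    @Override
    public int hammingDistance(BitSetHammingFilter other) {
        // Work on a copy so that neither filter is modified.
        BitSet diff = (BitSet) bits.clone();
        diff.xor(other.bits);
        return diff.cardinality();
    }
}
```

For example, filters built from bits {1, 3, 5} and {3, 5, 7} differ in bits 1 and 7, so their
Hamming distance is 2, and a caller obtains that without ever seeing {{xor}} or {{cardinality}}.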

> BloomFilter contribution
> ------------------------
>
>                 Key: COLLECTIONS-728
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-728
>             Project: Commons Collections
>          Issue Type: Task
>            Reporter: Claude Warren
>            Priority: Minor
>         Attachments: BF_Func.md, BloomFilter.java, BloomFilterI2.java, Usage.md
>
>
> Contribution of BloomFilter library comprising base implementation and gated collections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
