lucene-dev mailing list archives

From Dmitry Serebrennikov
Subject Re: incorrect OO in lucene source?
Date Tue, 20 Apr 2004 18:04:59 GMT
Doug Cutting wrote:

> Robert Engels wrote:
>> Lucene is often cited as an excellent example of OO design.
> That is kind, but the primary goal of Lucene is to provide 
> functionality, not to use "correct" OO design.  The two are not always 
> in accord. 

Hear, hear!

>> Shouldn't 'Filter' just be an interface, with the method
>> boolean filter(int docnum);
>> as to whether or not the document should be filtered?
> How would this make the world a better place? 

There is actually a point worth discussing here. The current
implementation of Filter has two issues that a change along these lines
could help with. The first is that it uses java.util.BitSet, which in my
experience is needlessly slow (I think it has to do with the array
bounds checking and the code for dynamic resizing). In search-intensive
applications, it might be possible and useful to have a faster
implementation, if only Filter allowed it.
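
To make the idea concrete, here is a minimal sketch of what a per-document
filter backed by a fixed-size bit array might look like. DocFilter and
FixedBitFilter are hypothetical names, not part of Lucene, and the method
name include is the one suggested further down in this message. The sketch
only drops the dynamic resizing; the JVM still bounds-checks the array
access, so whether it actually beats java.util.BitSet would need measuring.

```java
// Hypothetical interface for illustration; not part of Lucene's API.
interface DocFilter {
    /** Returns true if the document should be kept in the results. */
    boolean include(int docnum);
}

/** Bit set sized once up front for a known document count; never resized. */
final class FixedBitFilter implements DocFilter {
    private final long[] words;

    /** maxDoc is assumed to be one greater than the largest document number. */
    FixedBitFilter(int maxDoc) {
        this.words = new long[(maxDoc + 63) >>> 6]; // 64 documents per long
    }

    /** Marks a document as included. */
    void set(int docnum) {
        words[docnum >>> 6] |= 1L << (docnum & 63);
    }

    public boolean include(int docnum) {
        return (words[docnum >>> 6] & (1L << (docnum & 63))) != 0;
    }
}
```

The caller would size the filter from the index's document count and set
one bit per matching document.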

The second point is that in other applications, it might be desirable to
have a compressed implementation of a Filter: applications where the
number of documents is very large and a lot of different filters must be
maintained in memory. The filter bitset should compress really well even
with something as simple as run-length encoding, if only Filter allowed it.
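
As a rough illustration of the compression idea, the sketch below stores
each run of consecutive included documents as a start/end pair, which is
one simple form of run-length encoding, and answers lookups with a binary
search over the run starts. RunLengthFilter is a hypothetical class, not
anything in Lucene, and it assumes the filter is built once from an
existing java.util.BitSet and then only read.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical read-only filter that stores runs of included documents. */
final class RunLengthFilter {
    private final int[] starts; // first docnum of each run, ascending
    private final int[] ends;   // one past the last docnum of each run

    RunLengthFilter(BitSet bits) {
        // First pass: count the runs of consecutive set bits.
        int runs = 0;
        for (int s = bits.nextSetBit(0); s >= 0; s = bits.nextSetBit(bits.nextClearBit(s))) {
            runs++;
        }
        starts = new int[runs];
        ends = new int[runs];
        // Second pass: record where each run begins and ends.
        int i = 0;
        for (int s = bits.nextSetBit(0); s >= 0; ) {
            int e = bits.nextClearBit(s);
            starts[i] = s;
            ends[i] = e;
            i++;
            s = bits.nextSetBit(e);
        }
    }

    /** Returns true if the document falls inside one of the stored runs. */
    boolean include(int docnum) {
        int pos = Arrays.binarySearch(starts, docnum);
        if (pos >= 0) {
            return true; // docnum is the first document of a run
        }
        int run = -pos - 2; // last run that starts before docnum, or -1
        return run >= 0 && docnum < ends[run];
    }

    int runCount() {
        return starts.length;
    }
}
```

Memory use is two ints per run, so this only pays off when the set bits
cluster into long runs; a filter with scattered single documents would
take more space than the plain bitset, and each lookup costs a binary
search instead of a single array read.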

This isn't directly related to the OO discussion; I just used the
opportunity to bring up something I've been thinking about. As to the
method name "boolean filter(int docnum)", I think I would prefer
"boolean include(int docnum)", just for clarity of semantics and to keep
the same boolean sense as the current Filter interface.


> Doug
