lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Faceted Browsing questions
Date Tue, 27 Jun 2006 02:56:55 GMT

: It may not even be necessary to cache this type of lookup since it is
: simply a TermEnum through specific fields in the index.  Maybe simply
: doing the TermEnum in the request handler instead of iterating
: through a cache would be just as fast or faster.  Any thoughts on that?

While commuting I've been letting my brain bounce around various ideas
for a completely generic, totally reusable faceting request handler, and
I've been mulling over the same question ... my current theory is that it
might make sense to cache a bounded Priority queue of the Terms for each
faceting field where the priority is determined by the docFreq, and the
size is configurable.  That way you can start with the values in the
queue, and if/when you reach a point where the docFreq of the next item in
the queue is less than the lowest intersection count you've found so far,
and you already have as many items as you want to display, you don't have
to bother checking any of the other values (and you don't have to bother
with the TermEnum unless you completely exhaust the queue).
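
A rough sketch of that early-termination idea, in plain Java rather than
against the real Solr/Lucene classes.  Everything here (FacetSketch,
TermCount, topFacets, the intersect callback) is a made-up name for
illustration; it assumes the cached terms arrive already sorted by
descending docFreq, and it leaves out the TermEnum fallback for when the
cache is exhausted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.ToIntFunction;

// Hypothetical illustration only; none of these names are Solr or Lucene APIs.
class FacetSketch {

    /** A term plus a count: docFreq for cached entries, intersection size for results. */
    static final class TermCount {
        final String term;
        final int count;

        TermCount(String term, int count) {
            this.term = term;
            this.count = count;
        }
    }

    /**
     * @param cachedTerms terms for one faceting field, sorted by descending docFreq
     *                    (assumed to come from the cached bounded queue)
     * @param intersect   returns how many docs in the current result set contain the term
     * @param limit       number of facet values to display
     */
    static List<TermCount> topFacets(List<TermCount> cachedTerms,
                                     ToIntFunction<String> intersect,
                                     int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        // Min-heap on intersection count: the head is the weakest facet kept so far.
        PriorityQueue<TermCount> best =
                new PriorityQueue<>(Comparator.comparingInt((TermCount tc) -> tc.count));

        for (TermCount cached : cachedTerms) {
            // An intersection count can never exceed the term's docFreq, so once we
            // hold `limit` facets and the next docFreq is below the lowest count we
            // kept, no later term can displace anything.
            if (best.size() == limit && cached.count < best.peek().count) {
                break;
            }
            int hits = intersect.applyAsInt(cached.term);
            if (hits == 0) {
                continue;
            }
            best.add(new TermCount(cached.term, hits));
            if (best.size() > limit) {
                best.poll(); // drop the weakest facet
            }
        }

        List<TermCount> result = new ArrayList<>(best);
        result.sort(Comparator.comparingInt((TermCount tc) -> tc.count).reversed());
        return result;
    }
}
```

If the loop runs off the end of cachedTerms without hitting the break,
the cache was too small to prove the answer is complete, which is the
case where you would still need to walk the TermEnum.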

: My next challenge is to re-implement the catch-all facets that I used
: to do by unioning all documents in an (Open)BitSet and inverting it.
: How can I invert a DocSet (I realize I can get the bits and do it
: that way, but is there a better way)?

well, the most obvious solution i can think of would be a patch adding an
invert() method to DocSet, HashDocSet and BitDocSet.   :)

there was some discussion about this on the list previously if i recall
correctly.
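
For what it's worth, here is roughly what the bit-based version of such
an invert could look like.  This is only a sketch on java.util.BitSet,
not the actual DocSet/BitDocSet code; InvertSketch and invert are
hypothetical names, and maxDoc is assumed to be passed in by the caller
(deleted documents are not accounted for).

```java
import java.util.BitSet;

// Hypothetical illustration only; not part of Solr's DocSet API.
class InvertSketch {

    /**
     * Returns a new BitSet containing every doc id in [0, maxDoc) that is
     * NOT set in docs.  The input is left unmodified.
     */
    static BitSet invert(BitSet docs, int maxDoc) {
        BitSet inverted = (BitSet) docs.clone();
        // Discard any stray bits at or beyond maxDoc before flipping.
        if (inverted.length() > maxDoc) {
            inverted.clear(maxDoc, inverted.length());
        }
        // Flip every bit in [0, maxDoc): set becomes clear, clear becomes set.
        inverted.flip(0, maxDoc);
        return inverted;
    }
}
```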


-Hoss

