lucene-dev mailing list archives

From: Joe R
Subject: Re: search binning support
Date: Wed, 11 Oct 2006 13:55:30 GMT

We faced what might be a similar problem not too long ago.  Our app has to
support foldering: a document may be in one or more folders that the user
creates and populates by hand or via query.  We used a simple B-tree database
(Berkeley DB JE) and a hit collector that filters against that database when
selecting results.  We didn't go with an all-Lucene approach because foldering
has to be responsive (the user should see the document in the folder within
~5 seconds) and our catalogs are large; in other words, we didn't want to
modify and re-optimize the index very often.

This also allowed us to do our own "per-field" stored field implementation:
another Berkeley DB holds all our stored fields, and the Lucene index stores
only a single, small, non-Lucene document ID.  We pull only that small document
ID for the hit collector, and only the fields needed for the results from
Berkeley DB.
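
A minimal sketch of this kind of collector-side filtering, under stated
assumptions. All names here are hypothetical: FolderStore is an in-memory
stand-in for the Berkeley DB lookup, and FolderFilteringCollector only mimics
the shape of a Lucene hit collector (one callback per matching document)
rather than extending the real class. It illustrates the idea, not the actual
implementation described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the external folder database; an in-memory map
// keeps the sketch self-contained.
class FolderStore {
    private final Map<String, Set<String>> members = new HashMap<>();

    // Folder updates touch only this store, never the search index.
    void add(String folder, String docId) {
        members.computeIfAbsent(folder, k -> new HashSet<>()).add(docId);
    }

    boolean contains(String folder, String docId) {
        Set<String> ids = members.get(folder);
        return ids != null && ids.contains(docId);
    }
}

// Plays the role of a hit collector: it is called once per matching document
// and keeps only the hits that belong to the requested folder.
class FolderFilteringCollector {
    private final FolderStore store;
    private final String folder;
    private final List<String> accepted = new ArrayList<>();

    FolderFilteringCollector(FolderStore store, String folder) {
        this.store = store;
        this.folder = folder;
    }

    // docId is the small external document ID, assumed to be stored in the index.
    void collect(String docId, float score) {
        if (store.contains(folder, docId)) {
            accepted.add(docId);
        }
    }

    List<String> accepted() {
        return accepted;
    }
}

class FolderFilterDemo {
    public static void main(String[] args) {
        FolderStore store = new FolderStore();
        store.add("reports", "doc-1");
        store.add("reports", "doc-3");

        FolderFilteringCollector collector = new FolderFilteringCollector(store, "reports");
        // Simulated search hits; a real search would drive these callbacks.
        collector.collect("doc-1", 0.9f);
        collector.collect("doc-2", 0.7f);
        collector.collect("doc-3", 0.4f);

        System.out.println(collector.accepted()); // prints [doc-1, doc-3]
    }
}
```

Because membership lives outside the index, adding a document to a folder is a
single write to the store and is visible to the next search.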


--- Yu-Hui Jin wrote:

> Say I have N categories, and each item is assigned to one or more categories.
> I want the search results to be counted against each of the categories.
> I checked the Lucene in Action book, and there doesn't seem to be such a
> feature. So is there any plan to add binning to Lucene?
> It looks like this involves modifying part of Lucene's implementation,
> so that we can:
> - specify which index field is used as the binning field.
> - after we grab the doc-id list, perform N intersections just to get the
> counts: each intersection is performed on the result doc-id list and the
> doc-id list for all items assigned to a category.
> Is there a better approach, or any optimization of this one?
> Thanks,
> -Hui
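
The N-intersection counting proposed in the quoted message can be sketched
with java.util.BitSet, assuming both the result doc-id list and each
category's doc-id list are available as bit sets indexed by doc id. BinCounter
and countBins are hypothetical names for illustration, not part of Lucene.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class BinCounter {
    // For each category, counts how many result doc ids also belong to it.
    static Map<String, Integer> countBins(BitSet results, Map<String, BitSet> categoryDocs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : categoryDocs.entrySet()) {
            // Work on a copy so the caller's result set is left unmodified.
            BitSet overlap = (BitSet) results.clone();
            overlap.and(entry.getValue());
            counts.put(entry.getKey(), overlap.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        BitSet results = new BitSet();
        results.set(1);
        results.set(3);
        results.set(5);

        BitSet news = new BitSet();
        news.set(1);
        news.set(2);
        news.set(3);

        BitSet sports = new BitSet();
        sports.set(5);
        sports.set(6);

        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("news", news);
        categories.put("sports", sports);

        System.out.println(countBins(results, categories)); // prints {news=2, sports=1}
    }
}
```

Each count costs one bitwise AND plus a population count over the doc-id
range, so the total work grows with N times the number of documents; the
per-category bit sets can be built once and reused across queries as long as
the index does not change.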
