lucene-java-user mailing list archives

From Beady Geraghty <beadygerag...@gmail.com>
Subject Re: need some advice/help with negative query.
Date Fri, 06 Jan 2006 21:14:55 GMT
I thought along the lines of dummy:all as well, but for some reason
I chose the bitset.  I am not sure whether it matters which route I take.

My situation is that for now I have, say, 10000 documents,
but I may increase that by 100 times or more.  I would guess that
half of the documents qualify and half don't.
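
For a sense of scale, here is a minimal sketch of the bitset route using
only java.util.BitSet. The class name NegativeHits and the helpers
complement and approxBytes are made up for illustration and are not
Lucene API. It assumes doc IDs run from 0 to maxDoc - 1 and that one bit
per document is kept in memory.

```java
import java.util.BitSet;

class NegativeHits {
    /** Returns the doc IDs in [0, maxDoc) that are NOT set in hits. */
    static BitSet complement(BitSet hits, int maxDoc) {
        BitSet result = (BitSet) hits.clone(); // leave the caller's set untouched
        result.flip(0, maxDoc);                // invert only the valid doc ID range
        if (result.length() > maxDoc) {
            // Drop any stray bits at or beyond maxDoc.
            result.clear(maxDoc, result.length());
        }
        return result;
    }

    /** Approximate storage for one bit per document, in bytes. */
    static long approxBytes(int maxDoc) {
        return (maxDoc + 7L) / 8L;
    }

    public static void main(String[] args) {
        int maxDoc = 10;                       // hypothetical tiny index
        BitSet appleHits = new BitSet(maxDoc);
        appleHits.set(1);                      // pretend docs 1 and 4 contain "apple"
        appleHits.set(4);

        BitSet withoutApple = complement(appleHits, maxDoc);
        System.out.println(withoutApple);      // docs that do not contain "apple"

        System.out.println(approxBytes(10_000));
        System.out.println(approxBytes(1_000_000));
    }
}
```

At one bit per document, 10000 documents come to about 1250 bytes and
1000000 documents to about 125000 bytes, so under these assumptions the
set stays small either way. In a real index, deleted documents would
presumably still have to be masked out of the complement.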

It appears that everyone suggests that I take MatchAllDocsQuery
from the trunk.  Is this the right choice regardless of the number of
documents I have?

Thanks,

On 1/6/06, Beady Geraghty <beadygeraghty@gmail.com> wrote:
>
> Thank you all for your answers.
>
> On 1/6/06, Yonik Seeley <yseeley@gmail.com> wrote:
> >
> > Should we detect the case of all negative clauses and throw in
> > a MatchAllDocsQuery?
> >
> > I guess this would be done in the QueryParser, but one could also make
> > a case for doing it in the BooleanQuery.
> >
> > -Yonik
> >
> > On 1/6/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> > > With Lucene's trunk, there is a MatchAllDocsQuery.  You could use
> > > this in a BooleanQuery with your negative-only query.
> > >
> > > Another option, if you're at Lucene 1.4.3, is to index the same value
> > > for a dummy field for every document (say, "dummy:all") and use a
> > > TermQuery in a BooleanQuery with the negative-only query.
> > >
> > > As for BitSets, if you need to go that route a QueryFilter would
> > > give you the BitSet back that you could easily complement, but that
> > > might be overkill for what you need given the option above.
> > >
> > >         Erik
> > >
> > >
> > > On Jan 6, 2006, at 12:04 PM, Beady Geraghty wrote:
> > >
> > > > I would like to do queries that are negative. I mean a query with
> > > > only negative terms and phrases.  For example, retrieve all
> > > > documents that do not contain the term "apple".
> > > >
> > > > For now, I have a limited set of documents (say, 10000) to index.
> > > > I can create a bitset that represents the search result of hits on
> > > > "apple".
> > > > Then I complement (XOR) the result.
> > > > Each bit corresponds to a document ID.
> > > > My questions are:
> > > > Inside Lucene, are the hits represented in some form of a bitset?
> > > > Can I get at it directly?  I saw the BitSet class.  (I now use
> > > > Java's BitSet class.)
> > > > Assuming that hits are internally represented as a bitset, for a
> > > > small number of documents the bitset won't be very big.
> > > > But if there are plenty of hits and many more documents,
> > > > is the bitset still kept entirely in memory?
> > > >
> > > > Thank you
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
>
