lucene-java-user mailing list archives

From Carsten Schnober <schno...@ids-mannheim.de>
Subject No documents in TermsFilter.getDocIdSet()
Date Mon, 15 Apr 2013 16:33:05 GMT
Hi,
following up on the previous thread "Statically store sub-collections for
search", I'm trying to focus on the root of the problem I have
encountered.

First, I generate a TermsFilter with potentially many terms in one field:

-----------------------------------------
List<Term> docnames = new ArrayList<>(resource.getDocIDs().size());
for (String docid : resource.getDocIDs()) {
  docnames.add(new Term("id", docid));
}
TermsFilter filter = new TermsFilter(docnames);
-----------------------------------------

This filter is used to generate a DocIdSet object holding the allowable
documents in a loop over the atomic segments of my IndexReader reader:

-----------------------------------------
for (AtomicReaderContext atomic : reader.leaves()) {
  DocIdSet docids = filter.getDocIdSet(atomic, atomic.reader().getLiveDocs());
  DocIdSetIterator iterator = docids.iterator();
  while (iterator.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
    ...
  }
  ...
}
-----------------------------------------

The while-loop is never entered, i.e. there are no documents in docids.
However, docids is not null, and its iterator() call does return a
DocIdSetIterator object. The same technique works fine with another
Filter (a QueryWrapperFilter). Is this a bug, or am I using the
TermsFilter (or the resulting DocIdSet) in the wrong way? Are there any
working examples of how to get a properly populated DocIdSet from a
TermsFilter?

I read that the iterator() method has to be implemented for every
DocIdSet implementation. Also, TermsFilter.getDocIdSet() seems to return
either null or a FixedBitSet, which seems to implement its iterator() via
an OpenBitSetIterator.
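
To make the per-segment pattern above concrete without depending on a
particular Lucene version, here is a minimal stand-alone sketch in plain
Java. It is not Lucene code: java.util.BitSet stands in for the per-segment
DocIdSet, and the Segment class and collectGlobalDocIds() method are
hypothetical names made up for this illustration. It models two points: a
segment may yield null when nothing matches, and doc IDs inside a segment
are segment-relative, so the segment's doc base (AtomicReaderContext.docBase
in Lucene 4.x) has to be added to obtain index-wide IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class SegmentDocIds {

    /** Hypothetical stand-in for one atomic segment and the filter result for it. */
    static final class Segment {
        final int docBase;     // offset of this segment's first document in the whole index
        final BitSet matches;  // segment-relative matching doc IDs; null means "no matches"

        Segment(int docBase, BitSet matches) {
            this.docBase = docBase;
            this.matches = matches;
        }
    }

    /** Walks all segments, skips null results, and converts local IDs to global ones. */
    static List<Integer> collectGlobalDocIds(List<Segment> segments) {
        List<Integer> hits = new ArrayList<>();
        for (Segment segment : segments) {
            BitSet matches = segment.matches;
            if (matches == null) {
                continue; // nothing matched in this segment
            }
            // nextSetBit returns -1 when no further bit is set
            for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
                hits.add(segment.docBase + doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        first.set(1);
        first.set(3);
        BitSet third = new BitSet();
        third.set(0);

        List<Segment> segments = Arrays.asList(
                new Segment(0, first),
                new Segment(5, null),   // a segment without any match
                new Segment(8, third));

        System.out.println(collectGlobalDocIds(segments)); // prints [1, 3, 8]
    }
}
```

In the real loop the same two checks would apply to the objects returned by
filter.getDocIdSet(...) and docids.iterator(), both of which are allowed to
be null in Lucene 4.x.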

Best,
Carsten

-- 
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP                 | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789      | schnober@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
