lucene-java-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: TermsEnum.docFreq() returns 0
Date Mon, 13 May 2013 13:25:12 GMT
Indexing code below. Looks very simple. Is this correct?

            // Writer configuration: Lucene 4.2 with StandardAnalyzer
            IndexWriterConfig conf = new IndexWriterConfig(
                    Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
            conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
            String indexPath = "<some-file-path>";
            Directory dir = FSDirectory.open(new File(indexPath));
            IndexWriter writer = new IndexWriter(dir, conf);

            // Field type: indexed, tokenized, with freqs and positions
            FieldType type = new FieldType();
            type.setTokenized(true);
            type.setIndexed(true);
            type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);

            // Single document with a single "content" field
            Document luceneDoc = new Document();
            Field field = new Field("content", "one two two three", type);
            luceneDoc.add(field);
            writer.addDocument(luceneDoc);
            writer.close();

Reading docFreq and totalTermFreq through the TermsEnum returns 0 and -1,
respectively, for all terms.
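
For reference, here is what I would expect these two statistics to be for the
single document above. This is a minimal plain-Java sketch, not Lucene code:
the class TermStatsSketch and its methods are hypothetical names made up for
illustration, and the whitespace tokenizer is only a rough stand-in for what
StandardAnalyzer does to this simple text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: none of these names are Lucene APIs.
class TermStatsSketch {

    // Lowercase and split on whitespace (assumed stand-in for the analyzer).
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // docFreq: number of documents that contain the term at least once.
    static int docFreq(List<String> docs, String term) {
        int count = 0;
        for (String doc : docs) {
            if (tokenize(doc).contains(term)) {
                count++;
            }
        }
        return count;
    }

    // totalTermFreq: total number of occurrences of the term over all documents.
    static long totalTermFreq(List<String> docs, String term) {
        long total = 0;
        for (String doc : docs) {
            for (String token : tokenize(doc)) {
                if (token.equals(term)) {
                    total++;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The one document indexed in the snippet above.
        List<String> docs = Arrays.asList("one two two three");
        for (String term : new String[] {"one", "three", "two"}) {
            System.out.println(term + " docFreq=" + docFreq(docs, term)
                    + " totalTermFreq=" + totalTermFreq(docs, term));
        }
    }
}
```

So for this index I would expect docFreq to be 1 for every term, and
totalTermFreq to be 2 for "two" and 1 for the others, not 0 and -1.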

--
Ravi


On Fri, May 10, 2013 at 10:19 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> It should not be 0, as long as TermsEnum.next() does not return null
> ... can you make a small test case?  Thanks.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, May 10, 2013 at 8:26 AM, Ravikumar Govindarajan
> <ravikumar.govindarajan@gmail.com> wrote:
> > I have to add that the above code is wrong.
> >
> > It has to be
> >
> >  while((ref=tEnum.next())!=null)
> >                     {
> >                         ref = tEnum.term();
> >                         tEnum.docFreq(); // Even here VAL=0
> >                     }
> >
> > Apologies for the mistake, but the problem remains
> >
> >
> >
> > On Fri, May 10, 2013 at 5:54 PM, Ravikumar Govindarajan <
> > ravikumar.govindarajan@gmail.com> wrote:
> >
> >> We have the following code
> >>
> >> SegmentInfos segments = new SegmentInfos();
> >>  segments.read(luceneDir);
> >>  for(SegmentInfoPerCommit sipc: segments)
> >> {
> >> String name = sipc.info.name;
> >> SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
> >> Terms terms = reader.terms("content");
> >> TermsEnum tEnum = terms.iterator(null);
> >>  tEnum.docFreq(); //VAL=0
> >>  tEnum.totalTermFreq(); //VAL=-1
> >> }
> >>
> >> The field "content" is indexed as DOCS_AND_FREQS_AND_POSITIONS
> >>
> >> Why is docFreq returned as 0 for all terms? Is this expected, or am I
> >> doing something wrong?
> >>
> >> --
> >> Ravi
> >>
> >>
> >>
>
