lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: TermsEnum.docFreq() returns 0
Date Mon, 13 May 2013 15:49:56 GMT
That code looks correct.

But can you tie it all together into a runnable test case?  I.e., add in
the terms enum, call docFreq, and show it returning 0 when it should be 1.
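
To make the expected numbers concrete, here is a plain-Java sketch (no Lucene involved) that counts by hand what docFreq and totalTermFreq ought to be for the single document "one two two three". The class and method names are made up for illustration, and it assumes simple whitespace tokenization rather than a real analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: computes the term statistics
// by brute force so they can be compared with what TermsEnum reports.
public class ExpectedTermStats {

    /** Number of documents that contain each term at least once. */
    static Map<String, Integer> docFreq(List<String> docs) {
        Map<String, Integer> df = new TreeMap<String, Integer>();
        for (String doc : docs) {
            // De-duplicate within the document: repeats count once.
            Set<String> unique = new HashSet<String>(Arrays.asList(doc.split("\\s+")));
            for (String term : unique) {
                Integer old = df.get(term);
                df.put(term, old == null ? 1 : old + 1);
            }
        }
        return df;
    }

    /** Total number of occurrences of each term across all documents. */
    static Map<String, Long> totalTermFreq(List<String> docs) {
        Map<String, Long> ttf = new TreeMap<String, Long>();
        for (String doc : docs) {
            for (String term : doc.split("\\s+")) {
                Long old = ttf.get(term);
                ttf.put(term, old == null ? 1L : old + 1L);
            }
        }
        return ttf;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("one two two three");
        System.out.println("docFreq       = " + docFreq(docs));       // {one=1, three=1, two=1}
        System.out.println("totalTermFreq = " + totalTermFreq(docs)); // {one=1, three=1, two=2}
    }
}
```

So with one document in the index, every term should report a docFreq of 1, and only "two" should have a totalTermFreq of 2.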

Also, if you run CheckIndex on the index produced by the code below,
how many terms/freqs/positions does it report?

Mike McCandless

http://blog.mikemccandless.com
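
For reference, a self-contained test case along the lines requested above might look like the sketch below. It is not code from this thread: it assumes Lucene 4.2 (lucene-core and lucene-analyzers-common) on the classpath, uses an in-memory RAMDirectory instead of the poster's file path, and reads terms through DirectoryReader and MultiFields rather than constructing a SegmentReader directly. The class and method names are hypothetical. The statistics are read only after next() has positioned the enum, and CheckIndex is run programmatically on the same index.

```java
import java.util.Map;
import java.util.TreeMap;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;

// Hypothetical repro class; assumes Lucene 4.2 jars are available.
public class DocFreqRepro {

    /** Indexes the single document from the thread into an in-memory directory. */
    static Directory buildIndex() throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriterConfig conf = new IndexWriterConfig(
                Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
        IndexWriter writer = new IndexWriter(dir, conf);

        FieldType type = new FieldType();
        type.setIndexed(true);
        type.setTokenized(true);
        type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);

        Document doc = new Document();
        doc.add(new Field("content", "one two two three", type));
        writer.addDocument(doc);
        writer.close(); // commits the segment
        return dir;
    }

    /** Walks every term of the field and records its docFreq. */
    static Map<String, Integer> docFreqs(Directory dir, String field) throws Exception {
        Map<String, Integer> result = new TreeMap<String, Integer>();
        DirectoryReader reader = DirectoryReader.open(dir);
        try {
            Terms terms = MultiFields.getTerms(reader, field);
            if (terms == null) {
                return result; // field has no indexed terms
            }
            TermsEnum tEnum = terms.iterator(null);
            BytesRef term;
            // The enum is unpositioned until next() returns a term.
            while ((term = tEnum.next()) != null) {
                System.out.println(term.utf8ToString()
                        + " docFreq=" + tEnum.docFreq()
                        + " totalTermFreq=" + tEnum.totalTermFreq());
                result.put(term.utf8ToString(), tEnum.docFreq());
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Directory dir = buildIndex();
        docFreqs(dir, "content");

        // Print CheckIndex's per-segment report (terms/freqs/positions).
        CheckIndex checker = new CheckIndex(dir);
        checker.setInfoStream(System.out);
        CheckIndex.Status status = checker.checkIndex();
        System.out.println("index clean: " + status.clean);
        dir.close();
    }
}
```

If indexing works as intended, each of the three terms should come back with docFreq=1. A result of 0 from this kind of loop would point at the index or the reader setup rather than at how the enum is being iterated.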


On Mon, May 13, 2013 at 9:25 AM, Ravikumar Govindarajan
<ravikumar.govindarajan@gmail.com> wrote:
> Indexing code below. Looks very simple. Is this correct?
>
>     IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_42,
>         new StandardAnalyzer(Version.LUCENE_42));
>     conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
>     String indexPath = "<some-file-path>";
>     Directory dir = FSDirectory.open(new File(indexPath));
>     writer = new IndexWriter(dir, conf);
>     FieldType type = new FieldType();
>     type.setTokenized(true);
>     type.setIndexed(true);
>     type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
>     Field field = new Field("content", "one two two three", type);
>     luceneDoc.add(field);
>     writer.addDocument(luceneDoc);
>     writer.close();
>
> Reading docFreq and totalTermFreq through the TermsEnum returns 0 and -1,
> respectively, for all terms.
>
> --
> Ravi
>
>
> On Fri, May 10, 2013 at 10:19 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> It should not be 0, as long as TermsEnum.next() does not return null
>> ... can you make a small test case?  Thanks.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Fri, May 10, 2013 at 8:26 AM, Ravikumar Govindarajan
>> <ravikumar.govindarajan@gmail.com> wrote:
>> > I have to add that the above code is wrong.
>> >
>> > It has to be
>> >
>> >     while ((ref = tEnum.next()) != null) {
>> >         ref = tEnum.term();
>> >         tEnum.docFreq(); // Even here VAL=0
>> >     }
>> >
>> > Apologies for the mistake, but the problem remains
>> >
>> >
>> >
>> > On Fri, May 10, 2013 at 5:54 PM, Ravikumar Govindarajan <
>> > ravikumar.govindarajan@gmail.com> wrote:
>> >
>> >> We have the following code
>> >>
>> >>     SegmentInfos segments = new SegmentInfos();
>> >>     segments.read(luceneDir);
>> >>     for (SegmentInfoPerCommit sipc : segments) {
>> >>         String name = sipc.info.name;
>> >>         SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
>> >>         Terms terms = reader.terms("content");
>> >>         TermsEnum tEnum = terms.iterator(null);
>> >>         tEnum.docFreq(); //VAL=0
>> >>         tEnum.totalTermFreq(); //VAL=-1
>> >>     }
>> >>
>> >> The field "content" is indexed with DOCS_AND_FREQS_AND_POSITIONS.
>> >>
>> >> Why is docFreq returned as 0 for all terms? Is this expected, or am I
>> >> doing something wrong?
>> >>
>> >> --
>> >> Ravi
>> >>
>> >>
>> >>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
