lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: term collection frequence in lucene 3.6.2?
Date Tue, 03 Sep 2013 10:55:27 GMT
3.6.x doesn't track this statistic, but 4.x does: TermsEnum.totalTermFreq().

In 3.6.x you could visit every doc containing the term (e.g. with TermDocs), summing up the .freq(), but this is slowish, since the cost grows with the number of matching docs.
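
A minimal sketch of that summing approach, written in plain Java rather than against the Lucene API so it runs on its own. The class and helper names (CollectionFreqSketch, termFreqs, collectionFreq) are made up for illustration. Each map stands in for the per-document frequencies that a 3.6.x postings enumeration would hand back through freq(), and the sample data is the two documents from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model, not Lucene code: each map stands in for the per-document
// term frequencies that a 3.6.x postings enumeration would report via freq().
final class CollectionFreqSketch {

    // Hypothetical helper: count whitespace-separated terms in one document.
    static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> freqs = new HashMap<String, Integer>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            Integer old = freqs.get(token);
            freqs.put(token, old == null ? 1 : old + 1);
        }
        return freqs;
    }

    // Visit every doc and sum the term's within-document frequency.
    static long collectionFreq(List<Map<String, Integer>> docs, String term) {
        long total = 0;
        for (Map<String, Integer> doc : docs) {
            Integer freq = doc.get(term);
            if (freq != null) {
                total += freq;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> docs = Arrays.asList(
            termFreqs("T1 T2 T3 T1 T1"),
            termFreqs("T1 T4 T4"));
        System.out.println(collectionFreq(docs, "T1")); // prints 4
    }
}
```

Against a real 3.6.x index the loop would only walk the docs that actually contain the term, but the work is still linear in that count, which is why a precomputed statistic like the 4.x one is preferable when available.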

Mike McCandless

http://blog.mikemccandless.com


On Tue, Sep 3, 2013 at 4:19 AM, jiangwen jiang <jiangwen127@gmail.com> wrote:
> Hi all,
>
> Does Lucene 3.6.2 store term collection frequency (that is, how many
> times a particular term appears across all documents)?
>
> for example:
> doc1 contains terms: T1 T2 T3 T1 T1
> doc2 contains terms: T1 T4 T4
> ....
>
> T1 appears 4 times in all documents, so term collection freq of T1 is 4
>
> Thanks for your help
> Regards

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

