You need the term frequency vector.
See IndexReader.getTermFreqVector(int, String) in the 2.4.1 API:
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/index/IndexReader.html#getTermFreqVector%28int,%20java.lang.String%29
The same method is available in 3.0 as well:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexReader.html#getTermFreqVector%28int,%20java.lang.String%29
Note that the Javadoc path changed in 3.0 (api/core/); the package and method signature are the same.
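
To make the idea concrete, here is a minimal sketch in plain Java of what a term frequency vector holds for a single document, and of one way to count a multi-word phrase. It deliberately does not use Lucene: the class name TermFreqSketch, its methods, and the naive tokenizer are all assumptions made up for illustration, not the library's API. In Lucene itself, docFreq(term) counts the documents containing a term rather than its occurrences within one document, and getTermFreqVector only returns data for fields that were indexed with term vectors enabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Lucene.
public class TermFreqSketch {

    // Naive tokenizer (an assumption): lowercase, split on anything
    // that is not a letter or digit, and drop empty tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Per-document term frequencies: term -> number of occurrences.
    // This is the information a term frequency vector carries.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new TreeMap<>();
        for (String t : tokenize(text)) {
            freqs.merge(t, 1, Integer::sum);
        }
        return freqs;
    }

    // Counts how often the phrase's tokens appear consecutively in the
    // document, by sliding a window over the document's token list.
    static int phraseFrequency(String text, String phrase) {
        List<String> doc = tokenize(text);
        List<String> words = tokenize(phrase);
        if (words.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + words.size() <= doc.size(); i++) {
            if (doc.subList(i, i + words.size()).equals(words)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String doc = "the quick brown fox jumps over the lazy dog";
        System.out.println(termFrequencies(doc));
        System.out.println(phraseFrequency(doc, "quick brown fox"));
    }
}
```

Single-term counts alone cannot answer the phrase question, because they carry no ordering; the sliding window above relies on token positions, which is why a phrase count needs positional information in addition to frequencies.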
On Wed, Dec 16, 2009 at 7:34 AM, Antonio Calò <anton.calo@gmail.com> wrote:
> Hi all,
>
> I hope that you can help me with this.
>
> I'm looking for a fast way to obtain, for a given word, its term frequency
> (I mean how many times it occurs in a single doc). I've been looking through
> the mail archive and the LIA (Lucene In Action) book and I found something
> like this:
>
> IndexSearcher index = new IndexSearcher(invertedIndexinRam);
> Term term = new Term("doc", "quick");
> int occurrence = index.docFreq(term);
>
> OK, occurrence contains the number of occurrences of the word "quick" in the
> index (in my case the index will contain only one document, for example "the
> quick brown fox jumps over the lazy dog"). In this case the occurrence will be 1.
> :)
>
> But now I need to retrieve the occurrence count of a composite word, for
> example "quick brown fox", but I'm having trouble working out how to do this.
>
> Thanks in advance for your help.
>
> Best Regards.
>
> Antonio
>
>
>
> --
> Antonio Calò
> ------------------------------------------
> Software Developer Engineer
> @ Intellisemantic
> Mail anton.calo@gmail.com
> Tel. 011-56.90.429
> ------------------------------------------
>
--
Ted Dunning, CTO
DeepDyve