lucene-java-user mailing list archives

From: Grant Ingersoll
Subject: Re: Text Categorization with Lucene (N-Gram technique)
Date: Tue, 26 Jul 2011 15:36:01 GMT

Lucene has support for n-grams during indexing and querying.  The rest (building the
category fingerprints and matching incoming text against them) would have to be done
by you.

<shamelessPlug>Taming Text chapter 7 has some basic implementations using Lucene to
do categorization.</shamelessPlug>


On Jul 24, 2011, at 12:38 PM, Saurabh Gokhale wrote:

> Hi All,
> I need to work on an application where I have to categorize text (groups of
> sentences) into multiple predefined categories.
> As I understand from searching the internet, this is theoretically possible
> with an n-gram based index, by matching the n-grams of the incoming text
> against the known fingerprint of each category.
> I wanted to know whether Lucene already has a contribution in this regard
> that I can find in the contrib directory, or whether there is an example
> that I can look at elsewhere.
> Saurabh
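
To make the fingerprint idea above concrete, here is a minimal sketch of rank-order
n-gram profile matching, in the style of the Cavnar and Trenkle "out-of-place" measure.
It is plain Python and does not use Lucene; the function names, the top_k cutoff of 300,
and the toy training texts are all assumptions for illustration, not Lucene APIs and not
the implementations from Taming Text.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Yield character n-grams of each word, with '_' marking word boundaries."""
    for word in text.lower().split():
        padded = "_" + word + "_"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                yield padded[i:i + n]


def build_profile(text, top_k=300):
    """Return {ngram: rank} for the top_k most frequent n-grams (rank 0 = most frequent)."""
    counts = Counter(char_ngrams(text))
    # Sort by descending count, then alphabetically so that ties are deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))[:top_k]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, cat_profile, max_penalty):
    """Sum of rank differences; n-grams absent from the category cost max_penalty."""
    total = 0
    for gram, rank in doc_profile.items():
        if gram in cat_profile:
            total += abs(rank - cat_profile[gram])
        else:
            total += max_penalty
    return total


def categorize(text, category_profiles, top_k=300):
    """Return the category whose fingerprint is closest to the text's profile."""
    doc = build_profile(text, top_k)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc, category_profiles[name], top_k),
    )


# Hypothetical toy training data; real fingerprints need far more text per category.
training = {
    "sports": "the team won the match after scoring a late goal in the final game",
    "finance": "the bank raised interest rates and the stock market fell on inflation fears",
}
profiles = {name: build_profile(text) for name, text in training.items()}

print(categorize("the striker scored a goal and the team won", profiles))
```

In a Lucene-based version, the n-gram generation step would presumably be handled by
Lucene's n-gram analysis at index and query time, while the profile ranking and the
distance computation would remain application code.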

Grant Ingersoll
