lucene-java-user mailing list archives

From Otis Gospodnetic <>
Subject Re: Indexing Non-Textual Data
Date Thu, 07 Apr 2011 05:04:58 GMT
Hi Chris,

Yes, people have done classification with Lucene before. Have a look at for some discussions
and actual code (in old JIRA issues).
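
To make the general idea concrete, here is a minimal sketch of one possible approach: turn each integer feature into a position-tagged, quantized token (the kind of string a text index can match on) and classify a new list by voting among the stored lists that share the most tokens. This is plain Python, not PyLucene, and it is not the code from the discussions mentioned above. The names `to_tokens`, `classify`, and `bin_size` are hypothetical, and the simple overlap count is only a stand-in for Lucene's relevance scoring.

```python
from collections import Counter


def to_tokens(features, bin_size=16):
    """Turn a list of integers into position-tagged, quantized tokens.

    Feature i with value v becomes "f{i}_{v // bin_size}", so nearby
    values in the same position collapse to the same token.
    bin_size is an illustrative assumption, not a recommended value.
    """
    return [f"f{i}_{value // bin_size}" for i, value in enumerate(features)]


def classify(query, training, k=3, bin_size=16):
    """Vote among the k training rows sharing the most tokens with the query.

    The overlap count is a crude substitute for a search engine's score.
    """
    query_tokens = set(to_tokens(query, bin_size))
    scored = []
    for features, label in training:
        overlap = len(query_tokens & set(to_tokens(features, bin_size)))
        scored.append((overlap, label))
    # Highest overlap first; ties keep their original order (stable sort).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Tiny made-up dataset: 4 features per row instead of 128.
training = [
    ([10, 200, 30, 40], "a"),
    ([12, 210, 35, 45], "a"),
    ([250, 5, 180, 90], "b"),
    ([240, 8, 170, 95], "b"),
]

print(classify([11, 205, 33, 41], training))
```

In a real index, the tokens for one list could be joined with spaces and stored in a single text field next to the label, with the query built from the new list's tokens. Whether that scores well enough for your data would need testing.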

Sematext :: :: Solr - Lucene - Nutch
Lucene ecosystem search ::

----- Original Message ----
> From: Chris Spencer <>
> To:
> Sent: Wed, April 6, 2011 7:46:45 PM
> Subject: Indexing Non-Textual Data
> Hi,
> I'm new to Lucene, so forgive me if this is a newbie question. I have a
> dataset composed of several thousand lists of 128 integer features, each
> list associated with a class label. Would it be possible to use Lucene as a
> classifier, by indexing the label with respect to these integer features,
> and then classify a new list by finding the most similar labels with Lucene?
> I'm specifically interested in doing so through the PyLucene API, so I've
> been going through the PyLucene samples, but they only seem to involve
> indexing text, not continuous features (understandably). Could anyone point
> me to an example that indexes non-textual data?
> I think the project Lire ( is using
> Lucene to do something similar to this, although with an emphasis on image
> features. I've dug into their code a little, but I'm not a strong Java
> programmer, so I'm not sure how they're pulling it off, nor how I might
> translate this into the PyLucene API. In your opinion, is this a practical
> use of Lucene?
> Regards,
> Chris
