mahout-user mailing list archives

From drahman <>
Subject text classification using mahout and lucene index
Date Tue, 11 Oct 2011 10:38:02 GMT
Hi everyone,

I want to use Mahout for text classification. Right now I'm reading through
some chapters of the book "Mahout in Action", but some of the code examples
aren't working for me yet. So I thought I would ask my question right away:
how can I use Mahout for text classification?

My problem is about categorizing text. I have a list of documents
(text+abstract) and for each document I have a list of keywords (multi-label


I want to train a classifier using this information to build a recommender.
The data is available as XML and as a Lucene index. I'm hoping that I can use
the existing Lucene data; if so, how?

Also, I want to use different algorithms or combinations of algorithms (e.g.
SVM + naive Bayes), so that I can compare the results.

What I need is direction, i.e. which functions in Mahout are interesting for

Thanks in advance!

PS: I got a failure notice when I tried to subscribe to the mailing list...
