mahout-user mailing list archives

From Andrew Butkus <>
Subject RE: Naive Bayes Classifier as a Recommender
Date Wed, 16 Oct 2013 05:47:34 GMT
I've been using the tfidf class to generate scores. I then use this
score to judge how good the classification is. If you need more info,
just say, and I can get you some code.

Sent from my Windows Phone

From: Pat Cunnane
Sent: 15/10/2013 23:00
Subject: Naive Bayes Classifier as a Recommender
Hi, I've got a dataset of millions of short documents (think Twitter) that
can be in one of about 30,000 categories. When a user is creating a new
document, I want to suggest a list of 5 possible categories for that
document to go into.

Right now I'm using the Naive Bayes classifier in Mahout and sorting the
results by score. My problem is that sometimes the recommender is not very
accurate and I'd like to know:

Is there any way to find out a confidence level for a classification?
Ideally then I could set a threshold and not display recommendations if the
classifier is not confident.
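
To make the threshold idea concrete, here is a minimal sketch in plain Java. It assumes the classifier hands back one log-scale score per category (an assumption; check what your Mahout version actually returns). The class and method names (ConfidenceFilter, normalize, suggest) are hypothetical and not part of Mahout. The scores are normalized so they sum to 1, and nothing is suggested unless the best category clears a minimum value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a Mahout class.
final class ConfidenceFilter {

    /** Softmax: turns log-scale scores into values that sum to 1. */
    static double[] normalize(double[] logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s);
        }
        double[] p = new double[logScores.length];
        double sum = 0.0;
        for (int i = 0; i < logScores.length; i++) {
            // Subtracting the max keeps exp() from underflowing to zero.
            p[i] = Math.exp(logScores[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    /**
     * Returns the indices of up to k best categories, best first,
     * or an empty list when the top normalized score is below minConfidence.
     */
    static List<Integer> suggest(double[] logScores, int k, double minConfidence) {
        final double[] p = normalize(logScores);
        Integer[] order = new Integer[p.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort category indices by descending normalized score.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -p[i]));

        List<Integer> result = new ArrayList<>();
        if (order.length == 0 || p[order[0]] < minConfidence) {
            return result; // not confident enough: show nothing
        }
        for (int i = 0; i < Math.min(k, order.length); i++) {
            result.add(order[i]);
        }
        return result;
    }
}
```

Calling ConfidenceFilter.suggest(scores, 5, threshold) would give the list of 5 suggestions, or nothing when the top score is weak. Naive Bayes scores tend to be overconfident rather than calibrated probabilities, so the threshold is best picked by checking accuracy on held-out documents rather than read as a literal probability.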

Also, would it be better to consider another algorithm to achieve my goals?
I chose Naive Bayes because my dataset is pure text and very large. Any
thoughts would be greatly appreciated.


