mahout-user mailing list archives

From David Noel <>
Subject Clustering raw articles vs clustering (Stanford's) NER output
Date Mon, 12 May 2014 13:29:33 GMT
I've spent a few weeks tuning Mahout to cluster news articles and have
had decent results, though there is still room for improvement. While
looking for ways to improve them, I had the idea of running Mahout on
the output of Stanford's Named Entity Recognizer (NER) instead of on
the articles themselves, and comparing the two. Has anyone tried
this? Did it produce more cohesive clusters?
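
To make the idea concrete, here is a minimal sketch of the preprocessing
step, assuming the NER output is in slash-tagged form (word/TAG, with O
for non-entity tokens). The function name entities_only and the choice
to join multi-word entities with underscores are illustrative
assumptions, not part of Mahout or Stanford NER. The reduced text would
then be vectorized and clustered in place of the full article.

```python
def entities_only(tagged_text, background_tag="O"):
    """Reduce slash-tagged text (word/TAG) to a list of entity strings.

    Hypothetical helper: consecutive tokens sharing the same non-background
    tag are merged into one entity. Two distinct adjacent entities with the
    same tag would also be merged, which is a known limitation of this sketch.
    """
    entities = []
    current_words = []
    current_tag = None

    def flush():
        # Emit the entity collected so far, if any.
        if current_words:
            entities.append("_".join(current_words))

    for token in tagged_text.split():
        word, sep, tag = token.rpartition("/")
        if not sep:
            # Token carries no tag at all; treat it as a non-entity.
            word, tag = token, background_tag

        if tag == background_tag:
            flush()
            current_words, current_tag = [], None
        elif tag == current_tag:
            current_words.append(word)
        else:
            flush()
            current_words, current_tag = [word], tag

    flush()
    return entities


if __name__ == "__main__":
    sample = "Barack/PERSON Obama/PERSON visited/O Berlin/LOCATION on/O Monday/O ./O"
    print(" ".join(entities_only(sample)))
```

Writing one such reduced document per article, and running the same
clustering configuration on both the raw and the reduced corpus, would
give a like-for-like comparison of cluster cohesion.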
