mahout-user mailing list archives

From: Ken Krugler
Subject: Part 2 blog post on extracting text features
Date: Mon, 22 Jul 2013 02:57:34 GMT
Hi Mahouters,

I just posted part 2 of a series on extracting text features for machine learning…

The top five terms (by LLR score) in emails written by Ted are now u_k, v_k, sgd, regress,
and categori, which is way better than the very first results (see previous blog post), which
were v3, 3, v2, q, and 0.00000.
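
For anyone unfamiliar with LLR scoring, here is a minimal sketch of how a log-likelihood ratio score can rank one author's terms against everyone else's. It uses the entropy-based form of Dunning's G² statistic on a 2x2 contingency table. The function names (`llr`, `top_terms`) and the term-count dictionaries are hypothetical and only for illustration; this is not necessarily how the blog post computes its scores.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: occurrences of the term in the author's emails
    k12: occurrences of all other terms in the author's emails
    k21: occurrences of the term in everyone else's emails
    k22: occurrences of all other terms in everyone else's emails
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def top_terms(author_counts, other_counts, n=5):
    # Hypothetical helper: rank the author's terms by LLR score.
    # Note that LLR is two-sided, so a term the author uses unusually
    # rarely can also score high; this sketch does not filter those out.
    author_total = sum(author_counts.values())
    other_total = sum(other_counts.values())
    scores = {}
    for term, k11 in author_counts.items():
        k21 = other_counts.get(term, 0)
        scores[term] = llr(k11, author_total - k11, k21, other_total - k21)
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Calling `top_terms(author_counts, other_counts)` with two dictionaries of term counts returns up to five terms, highest score first. A term used at the same rate by the author and by everyone else scores close to zero, which is why generic tokens tend to drop out once the input terms themselves are cleaned up.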


-- Ken

Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
