mahout-user mailing list archives

From Ted Dunning <>
Subject Re: Tags generation?
Date Fri, 03 Aug 2012 07:43:14 GMT
tf-idf is a good approximation of the LLR score for many applications and
often gives useful signatures although not always super pretty.

It helps to set an overall minimum document frequency that a term must meet
before it is considered as a tag.  This is the same as an IDF maximum.
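
A rough sketch of that idea in Python, in case it helps. It is not Mahout
code: the function name tag_candidates and the parameters min_df and top_n
are made up for illustration, and it assumes each document is already a list
of tokens. It scores terms by mean tf-idf (length-normalized tf, natural-log
idf) and skips any term that appears in fewer than min_df documents.

```python
import math
from collections import Counter

def tag_candidates(docs, min_df=2, top_n=5):
    """Rank terms by mean tf-idf, skipping terms seen in fewer than min_df docs.

    docs is a list of token lists. All names here are illustrative only.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for term, count in tf.items():
            if df[term] < min_df:
                # Too rare. Equivalent to rejecting idf > log(n_docs / min_df).
                continue
            scores[term] += (count / len(doc)) * math.log(n_docs / df[term])
    # Highest score first, ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Mean over all documents; documents without the term contribute zero.
    return [(term, total / n_docs) for term, total in ranked[:top_n]]
```

Since idf = log(N / df), requiring df >= min_df is the same as requiring
idf <= log(N / min_df), which is where the IDF maximum comes from.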

On Fri, Aug 3, 2012 at 1:35 AM, Lance Norskog <> wrote:

> I'm looking for a good tags generator. A function from document/term
> matrix onto term list is a good bet, since it creates an analysis of
> the interplay of document and term. I have an LSA implementation for
> grinding on document/term matrices. This is very effective but seems
> overkill. Is there a simpler function from a document/term matrix onto
> a terms list? Maybe the mean tf-idf or log-entropy?
> --
> Lance Norskog
