mahout-user mailing list archives

From Jason Rennie <jren...@gmail.com>
Subject Re: Papers on text clustering
Date Wed, 11 Feb 2009 21:58:40 GMT
This might be a good starting point on modern methods:

http://www.cs.princeton.edu/~blei/papers/BleiLafferty2009.pdf

Blei is one of the premier researchers in this area.  Looks like he has lots
of useful info on his home page:

http://www.cs.princeton.edu/~blei/

Cheers,

Jason

On Wed, Feb 11, 2009 at 4:13 PM, Grant Ingersoll <gsingers@apache.org> wrote:

> I've read a number of papers on it, was just looking for items that people
> recommend as a way to, potentially, round out my knowledge of the different
> approaches.
>
> I've got the Data Mining book and the Foundations book, so I will refresh my
> memory on those as well.
>
>
>
> On Feb 11, 2009, at 12:39 PM, Isabel Drost wrote:
>
>> On Wednesday 11 February 2009, Grant Ingersoll wrote:
>>
>>> I'm looking for papers that you recommend on text clustering (I can,
>>> of course, go search for them, but I'd like recommendations).  New,
>>> old, doesn't matter.  Either send them here or add them to the wiki at
>>> http://cwiki.apache.org/confluence/display/MAHOUT/Reference+Reading
>>>
>>
>> Hmm, I know a few books that also cover the topic of clustering texts -
>> maybe one of these would be a good starting point.
>>
>> I like the book "Introduction to Information Retrieval" by Manning,
>> Raghavan and Schütze. It also contains some chapters on the topic.
>>
>> "Data Mining" from Witten and Frank has a chapter on the topic.
>>
>> "Foundations of Statistical Natural Language Processing" has a chapter as
>> well.
>>
>> Are you looking for something in particular?
>>
>> Isabel
>>
>>
>> --
>> Check it out, send me comments, and dance joyously in the streets,
>>        -- Linus Torvalds announcing 2.0.27
>>  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
>>  /,`.-'`'    -.  ;-;;,_
>> |,4-  ) )-,_..;\ (  `'-'
>> '---''(_/--'  `-'\_) (fL)  IM:  <xmpp://MaineC.@spaceboyz.net>
>>
>
>


-- 
Jason Rennie
Research Scientist, ITA Software
http://www.itasoftware.com/
