mahout-user mailing list archives

From Grant Ingersoll <grant.ingers...@gmail.com>
Subject Re: Clustering Question
Date Wed, 06 Apr 2011 01:40:20 GMT
What commands are you running to do the actual clustering?


On Apr 3, 2011, at 4:27 AM, sarath pr wrote:

> // Key = article ID, value = article text.
> SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
>     new Path(inputDir, "documents.seq"), Text.class, Text.class);
> 
> for (int i = 0; i < s.length; i++) {
>     writer.append(new Text(s[i][0]), new Text(s[i][1]));
> }
> writer.close();
> 
> Here s[i][0] is a string holding the ID of a news article, and
> s[i][1] is the text of that article. I have clustered 100+ news
> articles this way, and I get the output in
> clusteredPoints/part-m-00000. Is it possible to extract the article ID
> (i.e. the s[i][0] value I appended) and the corresponding cluster ID
> from the part-m-00000 file?
> 
> Does anyone know?
> 
> -- 
> Thank You..!!
> Sarath Ramachandran
> sarath.amrita@gmail.com
> +919995024287

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
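
Whether the article ID can be recovered depends on the commands asked about above, so what follows is only a sketch under stated assumptions, not a confirmed answer. It assumes each record in clusteredPoints/part-m-00000 pairs a cluster ID (the key) with a vector (the value), and that the vectors were created as named vectors so that each one still carries the original s[i][0] key as its name. If the vectors were not named, the article ID is most likely not in that file. Reading the SequenceFile itself is left out here; the hypothetical Assignment class stands in for one record after the cluster ID and vector name have been pulled out, and the sample IDs in main are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ClusterAssignments {

    /** Hypothetical holder for one clusteredPoints record, reduced to the two fields of interest. */
    static final class Assignment {
        final int clusterId;     // assumed to come from the record key
        final String articleId;  // assumed to come from the name of the record's vector

        Assignment(int clusterId, String articleId) {
            this.clusterId = clusterId;
            this.articleId = articleId;
        }
    }

    /** Builds a lookup from article ID to the cluster it was assigned to. */
    static Map<String, Integer> clusterByArticle(List<Assignment> records) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Assignment a : records) {
            result.put(a.articleId, a.clusterId);
        }
        return result;
    }

    /** Builds the reverse view: each cluster ID with the article IDs assigned to it. */
    static Map<Integer, List<String>> articlesByCluster(List<Assignment> records) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (Assignment a : records) {
            List<String> members = result.get(a.clusterId);
            if (members == null) {
                members = new ArrayList<>();
                result.put(a.clusterId, members);
            }
            members.add(a.articleId);
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample records; real ones would be read from part-m-00000.
        List<Assignment> records = new ArrayList<>();
        records.add(new Assignment(12, "news-001"));
        records.add(new Assignment(40, "news-002"));
        records.add(new Assignment(12, "news-003"));

        System.out.println(clusterByArticle(records));
        System.out.println(articlesByCluster(records));
    }
}
```

Both maps keep insertion order, so the printed output follows the order of the records in the file.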

