mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: intial centriods for fuzzy k means algorithm
Date Mon, 04 Feb 2013 16:36:13 GMT
Clusters have a constructor that accepts a vector that you can use for this.
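
For the situation described further down the thread (a few known points per cluster), the vector handed to such a constructor has to come from somewhere. One plausible choice, which is an assumption here and not something the thread prescribes, is the component-wise mean of the known points of each cluster. A minimal sketch in plain Java, with no Mahout classes involved; `SeedCentroids` and `meanOf` are hypothetical names used only for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class SeedCentroids {

    /**
     * Returns the component-wise mean of the given points.
     * Assumption: every point is a dense array of the same dimension.
     */
    public static double[] meanOf(List<double[]> points) {
        if (points.isEmpty()) {
            throw new IllegalArgumentException("need at least one point");
        }
        int dim = points.get(0).length;
        double[] mean = new double[dim];
        for (double[] p : points) {
            if (p.length != dim) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            // Accumulate the per-component sums.
            for (int i = 0; i < dim; i++) {
                mean[i] += p[i];
            }
        }
        // Divide the sums by the number of points to get the mean.
        for (int i = 0; i < dim; i++) {
            mean[i] /= points.size();
        }
        return mean;
    }

    public static void main(String[] args) {
        // Hypothetical known documents of one cluster, already vectorized.
        List<double[]> knownDocs = Arrays.asList(
            new double[] {1.0, 0.0, 2.0},
            new double[] {3.0, 0.0, 0.0});
        System.out.println(Arrays.toString(meanOf(knownDocs)));
    }
}
```

The resulting array would then be copied into whatever vector type the cluster constructor expects, one cluster per group of known documents.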

On 2/2/13 2:17 PM, sri krishna wrote:
>
> I checked the source code to see how ClusterWritables are used to write centroids to a
> sequence file, and I found this:
>
> SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
> Text.class, ClusterWritable.class);
> ClusterWritable clusterWritable = new ClusterWritable();
>
> clusterWritable.setValue(canopy);
> writer.append(new Text(canopy.getIdentifier()), clusterWritable);
>
> Here setValue expects a Cluster. How can I convert a raw
> vector (SequentialAccessSparseVector) to a Cluster?
>
>
>
> ________________________________
>   From: Jeff Eastman <jdog@windwardsolutions.com>
> To: user@mahout.apache.org
> Sent: Saturday, 2 February 2013 12:18 AM
> Subject: Re: intial centriods for fuzzy k means algorithm
>   
> If you don't specify a -k value but specify a -ci directory that
> contains clusters you want to use for the prior then the ClusterIterator
> will use them for kmeans and fuzzyk. You will need to create one or more
> sequence files containing ClusterWritables to do this.
>
> On 2/1/13 9:08 AM, sri krishna wrote:
>> My question was more about how I can generate new centroids based on the predefined
>> points I give; in my case I know the number of clusters and also a few points in each
>> cluster.
>>
>> ________________________________
>>     From: Rajesh Nikam <rajeshnikam@gmail.com>
>> To: user@mahout.apache.org; sri krishna <krishnainet@yahoo.com>
>> Sent: Friday, 1 February 2013 5:07 PM
>> Subject: Re: intial centriods for fuzzy k means algorithm
>>    
>> You could use canopy clustering from Mahout to initialize the centroids.
>>
>> Thanks
>> Rajesh
>>
>>
>> On Fri, Feb 1, 2013 at 4:43 PM, sri krishna <krishnainet@yahoo.com> wrote:
>>
>>> Hi,
>>>
>>>
>>> I have a sample set of a few documents for each cluster (the number of clusters
>>> and a few documents in each cluster are known in advance). How can I
>>> initialize the centroids with the known documents, so that the algorithm runs using
>>> the given data points as centroids in Mahout?

