spark-user mailing list archives

From Shashank Mandil <mandil.shash...@gmail.com>
Subject Re: Fwd: Need some help
Date Thu, 01 Sep 2016 20:18:01 GMT
Hi Aakash,

I think it generally means that you have to use the general Spark APIs,
such as DataFrames, to bring in the data and crunch the numbers, but you
cannot use the KMeans clustering implementation that is already present in
Spark's MLlib library.

I think a good place to start would be understanding what the KMeans
clustering algorithm is, and then looking into how you can use the DataFrame
API to implement KMeans clustering yourself.
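
To make the algorithm itself concrete, here is a minimal sketch of the usual
iterative KMeans procedure (Lloyd's algorithm) in plain Python, with no Spark
and no MLlib. The function names, the in-memory list of points, and the
parameters (k, max_iters, seed) are assumptions for illustration only; the
format of the attached file is not shown in this thread, so parsing is left
out. This is not a definitive implementation, only one way the steps could be
laid out.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def closest_center(point, centers):
    """Index of the center nearest to the given point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, max_iters=20, seed=0):
    """Hypothetical helper: returns k centers for a list of feature tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct input points to start
    for _ in range(max_iters):
        # Assignment step: accumulate a per-cluster (feature sums, count).
        sums = {}
        for p in points:
            i = closest_center(p, centers)
            total, count = sums.get(i, ([0.0] * len(p), 0))
            sums[i] = ([t + x for t, x in zip(total, p)], count + 1)
        # Update step: each center moves to the mean of its assigned points.
        # A center with no assigned points is left where it was.
        new_centers = [
            tuple(s / sums[i][1] for s in sums[i][0]) if i in sums else centers[i]
            for i in range(k)
        ]
        if new_centers == centers:  # no center moved, so we have converged
            break
        centers = new_centers
    return centers
```

If you port this to Spark, one possible mapping (again an assumption, not
something prescribed by the assignment) is that the assignment step becomes a
map over the data that emits (cluster index, (point, 1)) pairs, and the
per-cluster sums become a reduceByKey, with the small list of centers kept on
the driver between iterations.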

Thanks,
Shashank

On Thu, Sep 1, 2016 at 1:05 PM, Aakash Basu <aakash.spark.raj@gmail.com>
wrote:

> Hey Siva,
>
> It needs to be done with Spark, but without the use of any Spark libraries.
> I need some help with this.
>
> Thanks,
> Aakash.
>
> On Fri, Sep 2, 2016 at 1:25 AM, Sivakumaran S <siva.kumaran@icloud.com>
> wrote:
>
>> If you are to do it without Spark, you are asking in the wrong place. Try
>> Python + scikit-learn. Or R. If you want to do it with UI-based software,
>> try Weka or Orange.
>>
>> Regards,
>>
>> Sivakumaran S
>>
>> On 1 Sep 2016 8:42 p.m., Aakash Basu <aakash.spark.raj@gmail.com> wrote:
>>
>>
>> ---------- Forwarded message ----------
>> From: *Aakash Basu* <aakash.spark.raj@gmail.com>
>> Date: Thu, Aug 25, 2016 at 10:06 PM
>> Subject: Need some help
>> To: user@spark.apache.org
>>
>>
>> Hi all,
>>
>> Aakash here. I need a little help with KMeans clustering.
>>
>> This is what needs to be done:
>>
>> "Implement Kmeans Clustering Algorithm without using the libraries of
>> Spark. You're given a txt file with object ids and features from which you
>> have to use the features as your data points. This will be a part of the
>> code itself"
>>
>> Please find attached the file with ObjectIDs and features. How should I go
>> ahead and work on it?
>>
>> Thanks,
>> Aakash.
>>
>>
>>
>
