spark-user mailing list archives

From "Apostolos N. Papadopoulos" <papad...@csd.auth.gr>
Subject Re: Apply Kmeans in partitions
Date Wed, 30 Jan 2019 18:51:51 GMT
Hi Dimitri,

What error are you getting? Please specify.

Apostolos


On 30/1/19 16:30, dimitris plakas wrote:
> Hello everyone,
>
> I have a dataframe with 5040 rows, which are split into 5 groups. A
> column called "Group_Id" marks every row with a value from 0 to 4,
> depending on which group the row belongs to. I am trying to split my
> dataframe into 5 partitions and apply KMeans to every partition. I
> have tried
>
> rdd=mydataframe.rdd.mapPartitions(function, True)
> test = Kmeans.train(rdd, num_of_centers, "random")
>
> but I get an error.
>
> How can I apply KMeans to every partition?
>
> Thank you in advance,
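
The exact error is not given in the thread, so the following is only a sketch of one possible approach. KMeans.train in pyspark.mllib is itself a distributed job that expects a whole RDD, so it cannot run once per partition inside mapPartitions. One option is to filter the dataframe by "Group_Id" and call KMeans.train once per group on the driver. Another is to run a local clustering routine inside the partition function. The sketch below assumes the second option and uses a small pure-Python Lloyd's k-means so that it runs without Spark. The names kmeans_local and kmeans_per_group, and the (group_id, features) row layout, are hypothetical and do not come from the original message.

```python
import random
from collections import defaultdict


def kmeans_local(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Assumes len(points) >= k; random.sample raises ValueError otherwise.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialisation from the data
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers


def kmeans_per_group(rows, k=2):
    """Partition function: consumes (group_id, features) pairs from an
    iterator and yields one (group_id, centers) pair per group seen."""
    by_group = defaultdict(list)
    for group_id, features in rows:
        by_group[group_id].append(tuple(features))
    for group_id, points in by_group.items():
        yield group_id, kmeans_local(points, k)


if __name__ == "__main__":
    # Tiny local check without Spark: two groups, two centers each.
    rows = [
        (0, (0.0, 0.0)), (0, (0.0, 1.0)), (0, (10.0, 10.0)), (0, (10.0, 11.0)),
        (1, (5.0, 5.0)), (1, (5.0, 6.0)),
    ]
    for group_id, centers in kmeans_per_group(iter(rows), k=2):
        print(group_id, sorted(centers))
```

With Spark, a function like this could be passed to mapPartitions after keying the rows by "Group_Id" and calling partitionBy(5) on the resulting pair RDD, so that all rows of a group land in the same partition. If the per-group route with KMeans.train is preferred instead, it is worth checking the parameter order: in pyspark.mllib the third positional parameter of KMeans.train is maxIterations, so the initialization mode probably has to be passed by keyword (initializationMode="random").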

-- 
Apostolos N. Papadopoulos, Associate Professor
Department of Informatics
Aristotle University of Thessaloniki
Thessaloniki, GREECE
tel: ++0030312310991918
email: papadopo@csd.auth.gr
twitter: @papadopoulos_ap
web: http://datalab.csd.auth.gr/~apostol

