spark-user mailing list archives

From anishm <anish.mashan...@gmail.com>
Subject How to add all combinations of items rated by user and difference between the ratings?
Date Sat, 28 Mar 2015 12:09:46 GMT
The input file has the format: userid, movieid, rating
From this file, I want to extract, for each user, all possible pairs of
movies and the difference between their ratings:

(movie1, movie2),(rating(movie1)-rating(movie2))
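
A minimal sketch of this per-user step in plain Python, assuming one user's ratings are already collected as (movieid, rating) tuples. The function name `pair_differences`, the sample data, and the choice to sort by movie id (so that a pair always appears in the same order for every user) are assumptions made for illustration:

```python
from itertools import combinations


def pair_differences(ratings):
    """Yield ((movie1, movie2), rating1 - rating2) for every unordered
    pair of movies in one user's ratings.

    `ratings` is an iterable of (movieid, rating) tuples.
    """
    # Sort by movie id so each pair is always emitted as (smaller, larger).
    ordered = sorted(ratings)
    for (m1, r1), (m2, r2) in combinations(ordered, 2):
        yield (m1, m2), r1 - r2


# Hypothetical ratings for a single user.
user_ratings = [("m2", 3.0), ("m1", 5.0), ("m3", 4.0)]
result = list(pair_differences(user_ratings))
for pair, diff in result:
    print(pair, diff)
```

Fixing the order inside each pair matters later: without it, (m1, m2) from one user and (m2, m1) from another would be treated as different keys, and their differences would have opposite signs.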

This should be done for each user in the dataset. Finally, I would like to
find the average disagreement for each pair of movies:

(movie1, movie2), average difference between ratings
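
One possible way to express the whole computation with the PySpark RDD API is sketched below. It assumes the input is an RDD of comma-separated text lines (for example from `sc.textFile`), that ratings are numeric, and that "average" means the mean over all users who rated both movies. The helper names `parse_line`, `user_pairs`, and `average_pair_differences` are hypothetical:

```python
from itertools import combinations


def parse_line(line):
    """Parse 'userid,movieid,rating' into (userid, (movieid, rating))."""
    user, movie, rating = [field.strip() for field in line.split(",")]
    return user, (movie, float(rating))


def user_pairs(ratings):
    """Return [((movie1, movie2), rating1 - rating2), ...] for one user."""
    # Sorting fixes the order inside each pair across all users.
    ordered = sorted(ratings)
    return [((m1, m2), r1 - r2)
            for (m1, r1), (m2, r2) in combinations(ordered, 2)]


def average_pair_differences(lines):
    """`lines` is assumed to be an RDD of text lines.

    Returns an RDD of ((movie1, movie2), average difference).
    """
    return (lines.map(parse_line)                    # (user, (movie, rating))
                 .groupByKey()                       # (user, all ratings)
                 .flatMap(lambda kv: user_pairs(kv[1]))
                 .mapValues(lambda d: (d, 1))        # (sum, count) seed
                 .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
                 .mapValues(lambda s: s[0] / s[1]))  # sum / count
```

With a SparkContext `sc`, this would be called along the lines of `average_pair_differences(sc.textFile("ratings.csv")).collect()`, where the file name is only a placeholder. Note that `groupByKey` brings all of one user's ratings onto a single executor and the number of pairs grows quadratically with the number of movies a user rated, so this sketch is best suited to datasets where no single user has rated an extreme number of movies.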

How do I do this in Python?

I wrote code for this with Hadoop Streaming, but I am having a really hard
time converting it to Spark-compatible code.


