spark-user mailing list archives

From jaykatukuri <jkatuk...@apple.com>
Subject RDD to DataFrame for using ALS under org.apache.spark.ml.recommendation.ALS
Date Mon, 16 Mar 2015 16:08:33 GMT
Hi all,
I am trying to use the new ALS implementation under
org.apache.spark.ml.recommendation.ALS.



The new method to invoke for training seems to be:

    override def fit(dataset: DataFrame, paramMap: ParamMap): ALSModel

How do I create a DataFrame from a ratings data set that is stored on HDFS?

In contrast, the method in the old ALS implementation under
org.apache.spark.mllib.recommendation.ALS was:
 def train(
      ratings: RDD[Rating],
      rank: Int,
      iterations: Int,
      lambda: Double,
      blocks: Int,
      seed: Long
    ): MatrixFactorizationModel

My code to run the old ALS train method is as below:

    val sc = new SparkContext(conf)

    val pfile = args(0)
    val purchase = sc.textFile(pfile)
    val ratings = purchase.map(_.split(',') match {
      case Array(user, item, rate) =>
        Rating(user.toInt, item.toInt, rate.toInt)
    })

    val model = ALS.train(ratings, rank, numIterations, 0.01)


Now, for the new ALS fit method, I am trying to run the code below, but I
get a compilation error:

    val als = new ALS()
      .setRank(rank)
      .setRegParam(regParam)
      .setImplicitPrefs(implicitPrefs)
      .setNumUserBlocks(numUserBlocks)
      .setNumItemBlocks(numItemBlocks)

    val sc = new SparkContext(conf)

    val pfile = args(0)
    val purchase = sc.textFile(pfile)
    val ratings = purchase.map(_.split(',') match {
      case Array(user, item, rate) =>
        Rating(user.toInt, item.toInt, rate.toInt)
    })

    val model = als.fit(ratings.toDF())

The compiler reports that the method toDF() is not a member of
org.apache.spark.rdd.RDD[org.apache.spark.ml.recommendation.ALS.Rating[Int]].

I appreciate the help!

Thanks,
Jay
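
Note: the "toDF() is not a member of RDD" error usually means that the SQLContext implicits are not in scope, since toDF() is added to RDDs of case classes by an implicit conversion rather than defined on RDD itself. Below is a minimal sketch of one way this could be wired up, assuming the Spark 1.3 APIs (SQLContext and its implicits). The case class PurchaseRating, the helper parseRating, the object name, and the rank and regParam values are all hypothetical and chosen only for illustration; the field names user, item and rating are meant to match the default column names of the new ALS.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SQLContext

// Hypothetical case class (not from the original post). Its field names are
// chosen to match the default ALS column names: user, item, rating.
case class PurchaseRating(user: Int, item: Int, rating: Float)

object AlsDataFrameSketch {

  // Hypothetical helper: parses one "user,item,rate" line.
  // A line with a different number of fields throws a MatchError.
  def parseRating(line: String): PurchaseRating = line.split(',') match {
    case Array(user, item, rate) =>
      PurchaseRating(user.toInt, item.toInt, rate.toFloat)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AlsDataFrameSketch")
    val sc = new SparkContext(conf)

    // toDF() becomes available on RDDs of case classes only after
    // importing the implicits of an SQLContext instance.
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // args(0) is the path to the ratings file, for example on HDFS.
    val ratings = sc.textFile(args(0)).map(parseRating)

    // Illustrative parameter values only.
    val als = new ALS()
      .setRank(10)
      .setRegParam(0.01)

    val model = als.fit(ratings.toDF())

    sc.stop()
  }
}
```

The case class is declared at the top level rather than inside main because the schema for toDF() is derived by reflection, which tends to be more reliable for top-level classes.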






--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-to-DataFrame-for-using-ALS-under-org-apache-spark-ml-recommendation-ALS-tp22083.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
