spark-user mailing list archives

From satyajit vegesna <>
Subject Issue in using DenseVector in RowMatrix, error could be due to ml and mllib package changes
Date Fri, 09 Dec 2016 02:42:35 GMT
Hi All,

Please find the code below.

import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.SparkSession
import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by satyajit on 12/7/16.
  */
object DIMSUMusingtf extends App {

  val conf = new SparkConf()
  val sc = new SparkContext(conf)
  val spark = SparkSession.builder().config(conf).getOrCreate()


  val sentenceData = spark.createDataFrame(Seq(
    (0, "Hi I heard about Spark"),
    (0, "I wish Java could use case classes"),
    (1, "Logistic regression models are neat")
  )).toDF("label", "sentence")

  val tokenizer = new Tokenizer().setInputCol("sentence").setOutputCol("words")

  val wordsData = tokenizer.transform(sentenceData)

  val hashingTF = new HashingTF().setInputCol("words").setOutputCol("rawFeatures")

  val featurizedData = hashingTF.transform(wordsData)

  val idf = new IDF().setInputCol("rawFeatures").setOutputCol("features")
  val idfModel = idf.fit(featurizedData)
  val rescaledData = idfModel.transform(featurizedData)
  rescaledData.select("features", "label").take(3).foreach(println)
  val check = rescaledData.select("features")

  val row = check.rdd.map(row => row.getAs[SparseVector]("features"))

  val mat = new RowMatrix(row)
}

I am basically trying to use a DenseVector as direct input to RowMatrix, but I get the error "Cannot resolve constructor" on RowMatrix.


Any help would be appreciated.
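
One likely cause, assuming Spark 2.x: the transformers in org.apache.spark.ml.feature write org.apache.spark.ml.linalg vectors into the "features" column, while RowMatrix lives in mllib and its constructor takes an RDD[org.apache.spark.mllib.linalg.Vector]. Because RDD is invariant in its element type, an RDD[SparseVector] would also be rejected even if the SparseVector came from mllib. The sketch below converts each row with Vectors.fromML before building the matrix. The object name, the local master setting, and the import aliases are placeholders chosen for illustration, not part of the original code.

```scala
import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.mllib.linalg.{Vector => MLlibVector, Vectors => MLlibVectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical object name; the original post uses DIMSUMusingtf.
object RowMatrixFromMlFeatures {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("RowMatrixFromMlFeatures")
      .getOrCreate()

    val sentenceData = spark.createDataFrame(Seq(
      (0, "Hi I heard about Spark"),
      (0, "I wish Java could use case classes"),
      (1, "Logistic regression models are neat")
    )).toDF("label", "sentence")

    val tokenizer = new Tokenizer().setInputCol("sentence").setOutputCol("words")
    val hashingTF = new HashingTF().setInputCol("words").setOutputCol("rawFeatures")
    val featurized = hashingTF.transform(tokenizer.transform(sentenceData))
    val rescaled = new IDF().setInputCol("rawFeatures").setOutputCol("features")
      .fit(featurized)
      .transform(featurized)

    // Read each feature as the general ml Vector type (sparse or dense),
    // then convert it to the mllib Vector type that RowMatrix expects.
    // The explicit RDD[MLlibVector] annotation matches the constructor's parameter type.
    val rows: RDD[MLlibVector] = rescaled.select("features").rdd
      .map(r => MLlibVectors.fromML(r.getAs[MLVector]("features")))

    val mat = new RowMatrix(rows)
    println(s"rows = ${mat.numRows()}, cols = ${mat.numCols()}")

    spark.stop()
  }
}
```

Reading the column as the general ml Vector type avoids a cast failure if some rows happen to be dense rather than sparse.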

