spark-user mailing list archives

From Divya Gehlot <divya.htco...@gmail.com>
Subject [Error :] RDD TO Dataframe Spark Streaming
Date Thu, 01 Feb 2018 07:26:48 GMT
Hi,
I am getting the error below when creating a DataFrame from a Twitter streaming RDD:

val sparkSession: SparkSession = SparkSession
  .builder
  .appName("twittertest2")
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()
val sc = sparkSession.sparkContext
val ssc = new StreamingContext(sc, Seconds(2))
val tweets = TwitterUtils.createStream(ssc, None)
val twt = tweets.window(Seconds(60))

case class Tweet(createdAt:Long, text:String)

import org.apache.spark.sql.types._
import sparkSession.implicits._
def row(line: List[String]): Row = Row(line(0).toLong, line(1).toString)

val schema =
  StructType(
    StructField("createdAT", LongType, false) ::
      StructField("Text", StringType, true) :: Nil)


twt.map(status =>
  Tweet(status.getCreatedAt().getTime() / 1000, status.getText())
).foreachRDD(rdd =>
  rdd.toDF()
)


Error:
Error:(106, 15) value toDF is not a member of org.apache.spark.rdd.RDD[Tweet]
          rdd.toDF()

There is so much confusion in Spark 2 regarding SparkSession :(

Appreciate the help!

Thanks,
Divya
