spark-user mailing list archives

From Ewan Leith <ewan.le...@realitymine.com>
Subject RE: Converting a DStream to schemaRDD
Date Tue, 29 Sep 2015 14:09:25 GMT
Something like:

dstream.foreachRDD { rdd =>
  val df = sqlContext.read.json(rdd)
  df.select(…)
}

https://spark.apache.org/docs/latest/streaming-programming-guide.html#output-operations-on-dstreams


Might be the place to start. foreachRDD gives you each batch of the DStream as an RDD, which
you can then work with as if it were a standard RDD dataset.
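
A fuller version of the snippet above, as a rough sketch rather than a tested job. It assumes Spark 1.5 (for SQLContext.getOrCreate), one JSON document per line, and a socket source on localhost:9999. The source, app name and batch interval are made up for illustration and are not from the original question:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object DStreamJsonSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical settings: local master, 10-second batches.
    val conf = new SparkConf().setAppName("DStreamJsonSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Assumed input: a DStream[String] where every element is one JSON document.
    val dstream = ssc.socketTextStream("localhost", 9999)

    dstream.foreachRDD { rdd =>
      // Reuse one SQLContext across batches instead of creating a new one each time.
      val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)

      // Skip empty batches; there is nothing to infer a schema from.
      if (!rdd.isEmpty()) {
        // The schema is inferred separately for each batch.
        val df = sqlContext.read.json(rdd)
        df.printSchema()
        df.show()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since the schema is inferred per batch, it can differ between batches if the JSON fields vary. If the structure is known up front, passing it with sqlContext.read.schema(...) before .json(rdd) should avoid the inference pass.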

Ewan


From: Daniel Haviv [mailto:daniel.haviv@veracity-group.com]
Sent: 29 September 2015 15:03
To: user <user@spark.apache.org>
Subject: Converting a DStream to schemaRDD

Hi,
I have a DStream which is a stream of RDD[String].

How can I pass a DStream to sqlContext.jsonRDD and work with it as a DF?

Thank you.
Daniel
