spark-user mailing list archives

From maheshtwc
Subject Spark streaming RDDs to Parquet records
Date Tue, 17 Jun 2014 19:52:32 GMT

Is there an easy way to convert RDDs within a DStream into Parquet records?
Here is some incomplete pseudo code:

// Create streaming context
val ssc = new StreamingContext(...)

// Obtain a DStream of events
val ds = KafkaUtils.createStream(...)

// Get Spark context to get to the SQL context
val sc = ds.context.sparkContext

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// For each RDD
ds.foreachRDD((rdd: RDD[Array[Byte]]) => {
    // What do I do next?
})
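
One possible shape for the missing body, offered as an untested sketch rather than a confirmed answer. It assumes the Spark 1.0.x SQL API, where `import sqlContext.createSchemaRDD` implicitly converts an RDD of case-class instances into a SchemaRDD that can call `saveAsParquetFile`. The `Event` case class, the `decode` helper, the `ParquetSink` object, and the `baseDir` parameter are all hypothetical names introduced for illustration; the real schema and decoding depend on how the Kafka events are encoded.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

// Hypothetical record type: the real fields depend on the event format.
// Defined at top level so Spark SQL can infer a schema from it.
case class Event(payload: String)

object ParquetSink {

  // Hypothetical decoder: treats each message as a UTF-8 string.
  def decode(bytes: Array[Byte]): Event = Event(new String(bytes, "UTF-8"))

  // Writes every micro-batch of `ds` to its own Parquet directory under `baseDir`.
  def writeBatches(ds: DStream[Array[Byte]],
                   sqlContext: SQLContext,
                   baseDir: String): Unit = {
    // Assumption: Spark 1.0.x, where this implicit turns RDD[Event] into a SchemaRDD.
    import sqlContext.createSchemaRDD

    ds.foreachRDD { (rdd: RDD[Array[Byte]], time: Time) =>
      val events: RDD[Event] = rdd.map(decode)
      // One output directory per batch, named after the batch time,
      // so a later batch does not collide with an existing path.
      events.saveAsParquetFile(s"$baseDir/batch-${time.milliseconds}")
    }
  }
}
```

With the pseudocode above, this would be called as `ParquetSink.writeBatches(ds, sqlContext, "<output directory>")` before `ssc.start()`. Writing one directory per batch keeps the sketch simple, but it produces many small files, so a longer batch interval or a later compaction step may be needed.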

