spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From karthikjay <aswin8...@gmail.com>
Subject [Structured Streaming] Handling Kakfa Stream messages with different JSON Schemas.
Date Thu, 01 Mar 2018 00:46:53 GMT
Hi all,

I have the following code to stream Kafka data and apply a schema called
"hbSchema" on it and then act on the data. 

val df = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "10.102.255.241:9092")
      .option("subscribe", "mabr_avro_json1")
      .option("failOnDataLoss", "false")
      .load()
      .selectExpr("""deserialize("avro_json1", value) AS message""")

    import spark.implicits._

    val df1 = df
      .selectExpr("cast (value as string) as json")
      .select(from_json($"message", schema=hbSchema).as("data"))
      .select("data.*")

But, what if the data in Kafka topic have different schemas ? How do I apply
different schemas based on the data ?




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Mime
View raw message