spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tomas Carini <>
Subject Struct as parameter
Date Tue, 17 Jan 2017 17:11:31 GMT
Hi all.
I have the following schema that I need to filter with spark sql

|-- msg_id: string (nullable = true)
|-- country: string (nullable = true)
|-- sent_ts: timestamp (nullable = true)
|-- marks: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- namespace: string (nullable = true)
|    |    |-- name: string (nullable = true)
|    |    |-- value: integer (nullable = true)

I'd like to write something like
sqlContext.sql("select * from df where array_contains(tags,
But I was not able to write it.

What I'm doing rith now is to create a df with one row an select if from
there. So the element matching in the array works Ok, but I was not able to
write a literal or construct a variable to pass to the sql sentence.
Any help would be very appreciatted.

case class Mark(namespace: String, name: String, value: Option[Int])
case class Marks(tag: Mark)
val mark0 = sc.parallelize(Seq(Marks(Mark("a-mark", "0", null)))).toDF
val mark1 = sc.parallelize(Seq(Marks(Mark("a-mark", "1", null)))).toDF
def withmarks = sqlContext.sql("select df.*, 0 mark from df where
array_contains(marks, (select * from mark0)) union all select df.*, 1
mark from df where array_contains(marks, (select * from mark1))")

View raw message