spark-user mailing list archives

From "Chaudhary, Umesh" <Umesh.Chaudh...@searshc.com>
Subject Json Dataframe formation and Querying
Date Wed, 01 Jul 2015 11:45:48 GMT
Hi,
I am creating a DataFrame from a JSON file. The schema of the JSON, as printed by dataframe.printSchema(),
is:

root
|-- 1-F2: struct (nullable = true)
|    |-- A: string (nullable = true)
|    |-- B: string (nullable = true)
|    |-- C: string (nullable = true)
|-- 10-C4: struct (nullable = true)
|    |-- A: string (nullable = true)
|    |-- D: string (nullable = true)
|    |-- E: string (nullable = true)
|-- 11-B5: struct (nullable = true)
|    |-- A: string (nullable = true)
|    |-- D: string (nullable = true)
|    |-- F: string (nullable = true)
|    |-- G: string (nullable = true)

In the above schema, the struct-type elements {1-F2, 10-C4, 11-B5} are dynamic. This kind
of dynamic schema can be parsed easily by any JSON parser (e.g. gson, jackson), and a Map-type structure
makes it easy to query back and transform. In Spark 1.4, however, how should I query it using a construct
like:

dataframe.select([0]).show()  --> Index based query
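
To make the Map-style handling mentioned above concrete, here is a minimal sketch in plain Python (standard library only, no Spark calls) of reshaping such a record so that each dynamic key becomes the value of an ordinary column. The `to_rows` helper, the `id` column name, and the sample values are assumptions made up for illustration; only the key and field names come from the schema above.

```python
import json

# Hypothetical sample record mirroring the printed schema; the values are invented.
raw = """{
  "1-F2":  {"A": "a1", "B": "b1", "C": "c1"},
  "10-C4": {"A": "a2", "D": "d2", "E": "e2"},
  "11-B5": {"A": "a3", "D": "d3", "F": "f3", "G": "g3"}
}"""


def to_rows(record):
    """Turn {dynamic_key: {field: value}} into a list of flat dicts.

    The dynamic key is stored under "id" (an assumed column name), so the
    set of possible columns no longer depends on the data.
    """
    rows = []
    for key, fields in record.items():
        row = {"id": key}
        row.update(fields)
        rows.append(row)
    return rows


rows = to_rows(json.loads(raw))

# Positional access, the plain-Python analogue of the index-based query.
first = rows[0]
print(first["id"], first["A"])

# Key-based access: find the entry for one dynamic key.
match = [r for r in rows if r["id"] == "11-B5"]
print(match[0]["G"])
```

Rows in this shape could then be loaded as a DataFrame and filtered on `id` rather than addressed by position, though whether that suits the real data depends on details not shown here.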

I tried to save it as a table and then describe it from the spark-sql REPL, but the REPL
is unable to find my table.

What is the preferred way to deal with this type of use case in Spark?

Regards,
Umesh Chaudhary

