spark-user mailing list archives

From Bharathi Raja <raja...@yahoo.com.INVALID>
Subject RE: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe
Date Thu, 24 Dec 2015 12:06:09 GMT
Thanks Eran, I'll check the solution.

Regards,
Raja

-----Original Message-----
From: "Eran Witkon" <eranwitkon@gmail.com>
Sent: 12/24/2015 4:07 PM
To: "Bharathi Raja" <rajakbv@yahoo.com>; "Gokula Krishnan D" <email2dgk@gmail.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe

Raja, I found the answer to your question!
Look at http://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes
This is what you (and I) were looking for.
The general idea: you read the input as text, where ProjectDetails is just a string field, then
build the JSON string representation of the whole line. That gives you a nested JSON schema
which Spark SQL can read.
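
A minimal sketch of the "build the JSON string for the whole line" step, in plain Scala with no Spark dependency. The object and method names (LineToJson, toJsonLine) are hypothetical, and the sketch assumes each record sits on a single line, the name field contains no commas or double quotes, and the third field is a valid JSON object (straight quotes, commas between array elements).

```scala
object LineToJson {
  // Hypothetical helper, not part of Spark: rewrites
  //   (123456, Employee1, {"ProjectDetails":[...]})
  // as one JSON object:
  //   {"employeeID":123456,"Name":"Employee1","ProjectDetails":[...]}
  // Returns None for lines that do not match the assumed layout.
  def toJsonLine(line: String): Option[String] = {
    // Drop the surrounding parentheses if they are present.
    val trimmed = line.trim.stripPrefix("(").stripSuffix(")")
    // Split on the first two commas only; everything after them is the JSON payload.
    trimmed.split(",", 3).map(_.trim) match {
      case Array(id, name, payload) if payload.startsWith("{") && payload.endsWith("}") =>
        // Remove the payload's outer braces so its keys merge into the new object.
        val inner = payload.substring(1, payload.length - 1).trim
        Some(s"""{"employeeID":$id,"Name":"$name",$inner}""")
      case _ => None
    }
  }
}
```

The resulting strings, one JSON object per record, are the kind of input Spark's JSON reader can infer a nested schema from.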


Eran


On Thu, Dec 24, 2015 at 10:26 AM Eran Witkon <eranwitkon@gmail.com> wrote:

I don't have the exact answer for you, but I would look for something using the explode
method on DataFrame.
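
A rough sketch of what using explode could look like, assuming the records are already available as one valid JSON object per record. It is written against the SparkSession API (Spark 2.2 or later), which is newer than the Spark release current when this thread was written; on Spark 1.x the entry point would be SQLContext instead. The object and method names (FlattenProjects, flatten) are made up for illustration, and the sample values come from the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object FlattenProjects {
  // Hypothetical helper: expects columns employeeID, Name and
  // ProjectDetails (an array of structs), and returns one row per project.
  def flatten(nested: DataFrame): DataFrame =
    nested
      // explode emits one output row per element of the ProjectDetails array.
      .select(col("employeeID"), col("Name"), explode(col("ProjectDetails")).as("p"))
      // Pull the struct fields up to top-level columns.
      .select(col("employeeID"), col("Name"),
        col("p.ProjectName"), col("p.Description"), col("p.Duration"), col("p.Role"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("flatten-projects")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()
    import spark.implicits._

    // One JSON object per record; the second project has no Role on purpose.
    val records = Seq(
      """{"employeeID":123456,"Name":"Employee1","ProjectDetails":[""" +
      """{"ProjectName":"Web Develoement","Description":"Online Sales website","Duration":"6 Months","Role":"Developer"},""" +
      """{"ProjectName":"Scala Training","Description":"Training","Duration":"1 Month"}]}"""
    ).toDS()

    // Schema inference turns ProjectDetails into an array of structs.
    val nested = spark.read.json(records)

    flatten(nested).show(truncate = false)
    spark.stop()
  }
}
```

Because the JSON reader infers Role as a nullable field, a project without a Role should come out with null in that column rather than being dropped.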


On Thu, Dec 24, 2015 at 7:34 AM Bharathi Raja <rajakbv@yahoo.com> wrote:

Thanks Gokul, but the file I have is in the format I mentioned: the first two columns
are not in JSON format.

Thanks,
Raja


From: Gokula Krishnan D
Sent: 12/24/2015 2:44 AM
To: Eran Witkon
Cc: raja kbv; user@spark.apache.org

Subject: Re: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe


You can try this. I slightly modified the input structure, since the first two columns were
not in JSON format.






Thanks & Regards, 
Gokula Krishnan (Gokul)


On Wed, Dec 23, 2015 at 9:46 AM, Eran Witkon <eranwitkon@gmail.com> wrote:

Did you get a solution for this?


On Tue, 22 Dec 2015 at 20:24 raja kbv <rajakbv@yahoo.com.invalid> wrote:

Hi,


I am new to Spark.


I have a text file with the structure below.


 
(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, Description, Duration, Role}]})
Eg:
(123456, Employee1, {"ProjectDetails":[
    {"ProjectName": "Web Develoement", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"},
    {"ProjectName": "Spark Develoement", "Description": "Online Sales Analysis", "Duration": "6 Months", "Role": "Data Engineer"},
    {"ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month"}
  ]
})
 
 
Could someone help me parse and flatten the record into the DataFrame below using Scala?
 
employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Develoement, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Develoement, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null
 


Thank you in advance.


Regards,
Raja