From: Bharathi Raja
To: Eran Witkon, user@spark.apache.org
Subject: RE: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe
Date: Thu, 24 Dec 2015 11:00:36 +0530
Message-ID: <647913.34233.bm@smtp101.mail.sg3.yahoo.com>

Hi Eran,
I didn't get the solution yet.

Thanks,
Raja

-----Original Message-----
From: "Eran Witkon"
Sent: 12/23/2015 8:17 PM
To: "raja kbv"; "user@spark.apache.org"
Subject: Re: How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe

Did you get a solution for this?

On Tue, 22 Dec 2015 at 20:24 raja kbv wrote:

Hi,

I am new to Spark. I have a text file with the structure below:

(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, Description, Duration, Role}]})

E.g.:

(123456, Employee1, {"ProjectDetails":[
    { "ProjectName": "Web Development", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"}
    { "ProjectName": "Spark Development", "Description": "Online Sales Analysis", "Duration": "6 Months", "Role": "Data Engineer"}
    { "ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month" }
  ]
})

Could someone help me parse and flatten each record into the DataFrame below using Scala?

employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Development, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Development, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null

Thank you in advance.

Regards,
Raja
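No answer appears in the thread, so as an illustration only, here is a minimal, Spark-free Scala sketch of the parse-and-flatten step on a single record, using just the standard library. It assumes the curly quotes in the file have been normalized to plain ASCII double quotes and that project objects never nest; `FlattenRecord`, `field`, and `flatten` are names invented for this sketch, not part of any Spark API. Because the project objects in the sample are not comma-separated (so the payload is not strictly valid JSON), a regex scan is used here instead of a JSON parser.

```scala
// Spark-free sketch of the flattening logic for one record of the form
//   (employeeID, Name, {"ProjectDetails":[ {...} {...} ... ]})
object FlattenRecord {
  // Hypothetical sample record, curly quotes already normalized to ASCII.
  // Note the inner objects are not comma-separated, as in the original file.
  val record: String =
    """(123456, Employee1, {"ProjectDetails":[ {"ProjectName": "Web Development", "Description": "Online Sales website", "Duration": "6 Months", "Role": "Developer"} {"ProjectName": "Scala Training", "Description": "Training", "Duration": "1 Month"} ]})"""

  // Pull one named string field out of a single JSON-like object,
  // or "null" when the field is absent (e.g. a missing Role).
  def field(obj: String, name: String): String = {
    val re = ("\"" + name + "\"\\s*:\\s*\"([^\"]*)\"").r
    re.findFirstMatchIn(obj).map(_.group(1)).getOrElse("null")
  }

  // One CSV-like output row per {...} object inside the ProjectDetails array.
  def flatten(rec: String): Seq[String] = {
    // Split the fixed prefix (employeeID, Name) from the JSON payload.
    val Prefix = """\((\d+),\s*([^,]+),\s*(\{.*\})\)""".r
    rec match {
      case Prefix(id, name, json) =>
        val obj = """\{[^{}]*\}""".r // innermost {...} blocks only
        obj.findAllIn(json).toSeq.map { p =>
          Seq(id, name, field(p, "ProjectName"), field(p, "Description"),
              field(p, "Duration"), field(p, "Role")).mkString(", ")
        }
      case _ => Seq.empty
    }
  }

  def main(args: Array[String]): Unit =
    flatten(record).foreach(println) // prints one flattened row per project
}
```

In an actual Spark job one would more likely load the file with `spark.read.textFile`, split off the JSON column, parse it with `from_json` against an explicit schema (or run `spark.read.json` over the extracted strings), and then `explode` the `ProjectDetails` array to get one row per project.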