spark-user mailing list archives

From Jyoti Ranjan Mahapatra <jyot...@microsoft.com.INVALID>
Subject RE: Unable to read multiple JSON.Gz File.
Date Tue, 02 Oct 2018 20:48:35 GMT
Hi Mahender,
Which versions of Spark and Hadoop are you using?
I tried it on Spark 2.3.1 with Hadoop 2.7.3, and it works for a folder containing multiple
.gz files.


From: Mahender Sarangam <mahender.bigdata@outlook.com>
Sent: Monday, October 1, 2018 2:00 AM
To: user@spark.apache.org
Subject: Unable to read multiple JSON.Gz File.



I’m trying to read multiple .json.gz files from a Blob storage path using the Scala code
below, but I’m unable to read the data from the files or print the schema. If the files are
not compressed as .gz, we are able to read all of them into the DataFrame.
I’ve even tried specifying *.gz in the path, but no luck.
 val df = spark.read.json("wasb://XYZ@AzureStorage.blob.core.windows.net/sourcePath/")
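
For reference, here is a minimal sketch of the read with an explicit glob, followed by the schema print. It is not taken from either message. It assumes the files are line-delimited JSON, that their names end in .json.gz, and that the cluster already has the wasb connector and storage credentials configured. The object name and application name are hypothetical. The compression codec is chosen from the file extension, so the .gz suffix has to be present on each file.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, for illustration only.
object ReadGzJson {
  def main(args: Array[String]): Unit = {
    // Reuses an existing session if one is running (e.g. in a notebook or spark-shell).
    val spark = SparkSession.builder()
      .appName("ReadGzJson") // hypothetical application name
      .getOrCreate()

    // Same container and folder as in the question, with an explicit glob.
    // Assumption: every file in the folder ends in .json.gz.
    val path = "wasb://XYZ@AzureStorage.blob.core.windows.net/sourcePath/*.json.gz"

    // gzip files are decompressed transparently based on the .gz extension.
    val df = spark.read.json(path)

    df.printSchema()
    println(s"Row count: ${df.count()}")

    spark.stop()
  }
}
```

Because gzip is not splittable, each .gz file is read by a single task, so a folder with a few very large files will load slowly even when the read itself works.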