spark-user mailing list archives

From Daniel Jankovic <jankovic.dan...@gmail.com>
Subject Re: reading a csv.gz file from sagemaker using pyspark kernel mode
Date Thu, 08 Oct 2020 09:35:22 GMT
Hi,

I don't work much with either technology, but it looks like you haven't
given Spark all the information it needs to connect to and read from your
S3 bucket. You need the full S3 path (I doubt your bucket is really
s3://testdata) as well as valid access credentials. The Access Denied (403)
you are getting means the request reached S3 but was not authorized,
usually because the credentials or the role attached to your notebook
don't have permission on that bucket.
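For example, something along these lines usually works. This is just a
sketch: the bucket name, key path, and credentials below are placeholders,
and on SageMaker/EMR the cluster's IAM role normally supplies the
credentials instead of hard-coded keys (on EMR the s3:// scheme is served
by EMRFS; elsewhere s3a:// with hadoop-aws is the usual choice):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-csv-gz").getOrCreate()

    # Placeholder credentials; prefer the notebook/cluster IAM role in practice.
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    hadoop_conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
    hadoop_conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")

    # Full bucket + key path; the .gz file is decompressed automatically
    # based on its extension.
    path = "s3a://your-bucket-name/path/to/output1.csv.gz"
    df = spark.read.csv(path, sep="\t")
    df.show(5)

The key point is that Spark (the Hadoop S3 client underneath it) has to be
able to authenticate, whether through keys as above or through the role
attached to your notebook or cluster.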

BR,
Daniel

On Wed, Oct 7, 2020 at 3:44 PM cloudytech43 <cloudytechi.intellipaat@gmail.com> wrote:

> I am trying to read a compressed CSV file in PySpark, but I am unable to
> read it in the PySpark kernel in SageMaker.
>
> I can read the same file with pandas when the kernel is conda-python3 (in
> SageMaker).
>
> What I tried:
>
> file1 =  's3://testdata/output1.csv.gz'
> file1_df = spark.read.csv(file1, sep='\t')
>
> Error message:
>
> An error was encountered:
> An error occurred while calling 104.csv.
> : java.io.IOException:
>
> com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
> Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
> AccessDenied; Request ID: 7FF77313; S3 Extended Request ID:
>
> Kindly let me know if I am missing anything.
