spark-user mailing list archives

From Amjad ALSHABANI <ashshab...@gmail.com>
Subject Re: (python) Spark .textFile(s3://…) access denied 403 with valid credentials
Date Tue, 07 Mar 2017 16:56:51 GMT
Hi Jonhy,

What is the master you are using with spark-submit?

I've had this problem before. Unlike the CLI and boto3, Spark was running in YARN
distributed mode (--master yarn), so the keys were not copied to the executor
nodes. I had to submit my Spark job as follows:

$ spark-submit --master yarn-client \
    --conf "spark.executor.extraJavaOptions=-Daws.accessKeyId=ACCESSKEY -Daws.secretKey=SECRETKEY" \
    ....
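
An alternative sketch, in case the executor JVM options don't get picked up: pass the keys as Hadoop properties through the spark.hadoop.* conf prefix. The property names below assume the classic s3:// (jets3t) connector; they differ per connector, and my_job.py is just a placeholder for your script.

```shell
# Sketch: forward the AWS keys to Hadoop's S3 filesystem via spark.hadoop.*
# (my_job.py is a placeholder). For the s3:// scheme the jets3t connector
# reads fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey; for s3a:// you
# would use fs.s3a.access.key / fs.s3a.secret.key instead.
spark-submit --master yarn-client \
    --conf spark.hadoop.fs.s3.awsAccessKeyId=ACCESSKEY \
    --conf spark.hadoop.fs.s3.awsSecretAccessKey=SECRETKEY \
    my_job.py
```

Setting these as Spark conf (rather than exporting env vars on the driver) matters precisely because executors on other YARN nodes never see your local environment.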

I hope this helps.


Amjad

On Tue, Mar 7, 2017 at 4:21 PM, Jonhy Stack <so.jonhy@gmail.com> wrote:

> In order to access my S3 bucket I have exported my credentials:
>
>     export AWS_SECRET_ACCESS_KEY=
>     export AWS_ACCESS_KEY_ID=
>
> I can verify that everything works by running
>
>     aws s3 ls mybucket
>
> I can also verify with boto3 that it works in python
>
>     resource = boto3.resource("s3", region_name="us-east-1")
>     resource.Object("mybucket", "text/text.py") \
>         .put(Body=open("text.py", "rb"), ContentType="text/x-py")
>
> This works and I can see the file in the bucket.
>
> However, when I do this with Spark:
>
>     spark_context = SparkContext()
>     sql_context = SQLContext(spark_context)
>     spark_context.textFile("s3://mybucket/my/path/*")
>
> I get a nice
>
>     Caused by: org.jets3t.service.S3ServiceException: Service Error Message.
>     -- ResponseCode: 403, ResponseStatus: Forbidden, XML Error Message:
>     <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidAccessKeyId</Code>
>     <Message>The AWS Access Key Id you provided does not exist in our
>     records.</Message><AWSAccessKeyId>[MY_ACCESS_KEY]</AWSAccessKeyId>
>     <RequestId>XXXXX</RequestId><HostId>xxxxxxx</HostId></Error>
>
> This is how I submit the job locally:
>
>     spark-submit --packages com.amazonaws:aws-java-sdk-pom:1.11.98,org.apache.hadoop:hadoop-aws:2.7.3 test.py
>
> Why does it work with the command line and boto3, but Spark is choking?
>
