spark-user mailing list archives

From Jonhy Stack <>
Subject (python) Spark .textFile(s3://…) access denied 403 with valid credentials
Date Tue, 07 Mar 2017 15:21:37 GMT
In order to access my S3 bucket, I have exported my credentials.
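The exact export lines are not shown; presumably they were the standard AWS credential environment variables, something like (values elided):

    export AWS_ACCESS_KEY_ID=...
    export AWS_SECRET_ACCESS_KEY=...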


I can verify that everything works by doing

    aws s3 ls mybucket

I can also verify with boto3 that it works in python

    import boto3

    resource = boto3.resource("s3", region_name="us-east-1")
    resource.Object("mybucket", "text/") \
        .put(Body=open("", "rb"), ContentType="text/x-py")

This works and I can see the file in the bucket.

However, when I do the same with Spark, reading the bucket via textFile:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    spark_context = SparkContext()
    sql_context = SQLContext(spark_context)
    lines = spark_context.textFile("s3://…")

I get a nice

    > Caused by: org.jets3t.service.S3ServiceException: Service Error
    > Message. -- ResponseCode: 403, ResponseStatus: Forbidden, XML Error
    > Message: <?xml version="1.0"
    > encoding="UTF-8"?><Error><Code>InvalidAccessKeyId</Code><Message>The
    > AWS Access Key Id you provided does not exist in our
    > records.</Message><AWSAccessKeyId>[MY_ACCESS_KEY]</AWSAccess

This is how I submit the job locally:

    spark-submit --packages com.amazonaws:aws-java-sdk-pom

Why does it work from the command line and with boto3, but Spark is choking?
