spark-user mailing list archives

From Jean Georges Perrin <...@jgp.net>
Subject Re: Quick one... AWS SDK version?
Date Sat, 07 Oct 2017 19:50:09 GMT

Hey Marco,

I am actually reading from S3 and I use Hadoop 2.7.3, but I inherited the project and it uses some
AWS APIs from the Amazon SDK, at a version that is practically from yesterday :) so things get confusing, and Amazon
changes its SDK version so often that it is a little difficult to follow. For now I went
back to Hadoop 2.7.3 and SDK 1.7.4...

jg
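
A minimal sketch of the pinning described above, assuming the dependencies are pulled in through spark-submit's --packages option. The helper name s3a_packages and the lookup table are hypothetical; the table holds only the Hadoop 2.7.3 / SDK 1.7.4 pairing mentioned in this message, and the function refuses to guess for any other Hadoop version.

```python
# Hadoop version -> AWS Java SDK version to pair with it.
# Assumption: only the pairing reported in this thread is listed. For other
# releases, look the SDK version up in the hadoop-aws POM rather than guessing.
KNOWN_SDK_FOR_HADOOP = {
    "2.7.3": "1.7.4",
}


def s3a_packages(hadoop_version):
    """Build a comma-separated Maven coordinate list for spark-submit --packages."""
    try:
        sdk_version = KNOWN_SDK_FOR_HADOOP[hadoop_version]
    except KeyError:
        # Fail loudly: a mismatched SDK is exactly the brittle case to avoid.
        raise ValueError(
            "No known AWS SDK version for Hadoop " + hadoop_version
        ) from None
    return ",".join([
        "org.apache.hadoop:hadoop-aws:" + hadoop_version,
        "com.amazonaws:aws-java-sdk:" + sdk_version,
    ])


if __name__ == "__main__":
    print(s3a_packages("2.7.3"))
```

The resulting string could be passed as spark-submit --packages "<value>" so that hadoop-aws and the SDK are resolved together instead of picking up whichever SDK happens to be on the classpath.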


> On Oct 7, 2017, at 15:34, Marco Mistroni <mmistroni@gmail.com> wrote:
> 
> Hi JG
> Out of curiosity, what's your use case? Are you writing to S3? You could use Spark to do that, e.g. using the Hadoop package org.apache.hadoop:hadoop-aws:2.7.1. That should download the AWS client which is in line with Hadoop 2.7.1.
> 
> hth
>  marco
> 
>> On Fri, Oct 6, 2017 at 10:58 PM, Jonathan Kelly <jonathakamzn@gmail.com> wrote:
>> Note: EMR builds Hadoop, Spark, et al, from source against specific versions of certain
packages like the AWS Java SDK, httpclient/core, Jackson, etc., sometimes requiring some patches
in these applications in order to work with versions of these dependencies that differ from
what the applications may support upstream.
>> 
>> For emr-5.8.0, we have built Hadoop and Spark (the Spark Kinesis connector, that
is, since that's the only part of Spark that actually depends upon the AWS Java SDK directly)
against AWS Java SDK 1.11.160 instead of the much older version that vanilla Hadoop 2.7.3
would otherwise depend upon.
>> 
>> ~ Jonathan
>> 
>>> On Wed, Oct 4, 2017 at 7:17 AM Steve Loughran <stevel@hortonworks.com> wrote:
>>>> On 3 Oct 2017, at 21:37, JG Perrin <jperrin@lumeris.com> wrote:
>>>> 
>>>> Sorry Steve – I may not have been very clear: I was thinking about aws-java-sdk-z.yy.xxx.jar. To the best of my knowledge, none is bundled with Spark.
>>> 
>>> 
>>> I know, but if you are talking to S3 via the s3a client, you will need the SDK version that matches the hadoop-aws JAR, and that JAR must be the same version as the rest of your Hadoop JARs. Similarly, if you were using spark-kinesis, it needs to be in sync there.
>>>>  
>>>> From: Steve Loughran [mailto:stevel@hortonworks.com] 
>>>> Sent: Tuesday, October 03, 2017 2:20 PM
>>>> To: JG Perrin <jperrin@lumeris.com>
>>>> Cc: user@spark.apache.org
>>>> Subject: Re: Quick one... AWS SDK version?
>>>>  
>>>>  
>>>> On 3 Oct 2017, at 02:28, JG Perrin <jperrin@lumeris.com> wrote:
>>>>  
>>>> Hey Sparkians,
>>>>  
>>>> What version of AWS Java SDK do you use with Spark 2.2? Do you stick with
the Hadoop 2.7.3 libs?
>>>>  
>>>> You generally have to stick with the version which Hadoop was built with, I'm afraid... it is a very brittle dependency.
> 
