spark-user mailing list archives

From Jonathan Kelly <jonathaka...@gmail.com>
Subject Re: Quick one... AWS SDK version?
Date Sun, 08 Oct 2017 16:13:31 GMT
Tushar,

Yes, the hadoop-aws jar installed on an emr-5.8.0 cluster was built with
AWS Java SDK 1.11.160, if that’s what you mean.

~ Jonathan
On Sun, Oct 8, 2017 at 8:42 AM Tushar Sudake <etushar89@gmail.com> wrote:

> Hi Jonathan,
>
> Does that mean Hadoop-AWS 2.7.3 is also built against AWS SDK 1.11.160 and
> not 1.7.4?
>
> Thanks.
>
>
> On Oct 7, 2017 3:50 PM, "Jean Georges Perrin" <jgp@jgp.net> wrote:
>
>
> Hey Marco,
>
> I am actually reading from S3 and I use Hadoop 2.7.3, but I inherited the
> project, and it uses some AWS APIs from the Amazon SDK at a version that is
> practically from yesterday :) so it's confusing, and Amazon changes its
> versions like crazy, so it's a little difficult to follow. For now I have
> gone back to 2.7.3 and SDK 1.7.4...
>
> jg
>
>
> On Oct 7, 2017, at 15:34, Marco Mistroni <mmistroni@gmail.com> wrote:
>
> Hi JG,
> Out of curiosity, what's your use case? Are you writing to S3? You could use
> Spark to do that, e.g. using the Hadoop package
> org.apache.hadoop:hadoop-aws:2.7.1. That will download the AWS client
> which is in line with Hadoop 2.7.1?
>
> hth
>  marco
>
> On Fri, Oct 6, 2017 at 10:58 PM, Jonathan Kelly <jonathakamzn@gmail.com>
> wrote:
>
>> Note: EMR builds Hadoop, Spark, et al, from source against specific
>> versions of certain packages like the AWS Java SDK, httpclient/core,
>> Jackson, etc., sometimes requiring some patches in these applications in
>> order to work with versions of these dependencies that differ from what the
>> applications may support upstream.
>>
>> For emr-5.8.0, we have built Hadoop and Spark (the Spark Kinesis
>> connector, that is, since that's the only part of Spark that actually
>> depends upon the AWS Java SDK directly) against AWS Java SDK 1.11.160
>> instead of the much older version that vanilla Hadoop 2.7.3 would otherwise
>> depend upon.
>>
>> ~ Jonathan
>>
>> On Wed, Oct 4, 2017 at 7:17 AM Steve Loughran <stevel@hortonworks.com>
>> wrote:
>>
>>> On 3 Oct 2017, at 21:37, JG Perrin <jperrin@lumeris.com> wrote:
>>>
>>> Sorry Steve – I may not have been very clear: I was thinking about
>>> aws-java-sdk-z.yy.xxx.jar. To the best of my knowledge, none is bundled
>>> with Spark.
>>>
>>>
>>>
>>> I know, but if you are talking to S3 via the s3a client, the SDK version
>>> must match the one the hadoop-aws JAR was built against, and hadoop-aws
>>> must in turn be the same version as your other Hadoop JARs. Similarly, if
>>> you were using spark-kinesis, the SDK needs to be in sync there too.
>>>
>>>
>>> *From:* Steve Loughran [mailto:stevel@hortonworks.com]
>>> *Sent:* Tuesday, October 03, 2017 2:20 PM
>>> *To:* JG Perrin <jperrin@lumeris.com>
>>> *Cc:* user@spark.apache.org
>>> *Subject:* Re: Quick one... AWS SDK version?
>>>
>>>
>>>
>>> On 3 Oct 2017, at 02:28, JG Perrin <jperrin@lumeris.com> wrote:
>>>
>>> Hey Sparkians,
>>>
>>> What version of AWS Java SDK do you use with Spark 2.2? Do you stick
>>> with the Hadoop 2.7.3 libs?
>>>
>>>
>>> You generally have to stick with the version that Hadoop was built
>>> with, I'm afraid... it's a very brittle dependency.
>>>
>>>
>
>
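
The version pairings mentioned in this thread can be turned into a quick sanity check. What follows is a minimal Python sketch, not something posted by any of the participants: it scans a list of JAR file names, pulls out the hadoop-aws and aws-java-sdk versions, and compares them against the two pairings stated above (vanilla Hadoop 2.7.3 with SDK 1.7.4, and the emr-5.8.0 build with SDK 1.11.160). The function names, the `build` labels, and the example file names are all hypothetical, and real JAR naming on a given cluster may differ.

```python
import re

# Pairings taken from this thread only; anything else is unknown to this sketch.
# Key: (hadoop-aws version, build label) -> AWS Java SDK version it was built with.
# The build labels "vanilla" and "emr-5.8.0" are hypothetical names for illustration.
EXPECTED_SDK = {
    ("2.7.3", "vanilla"): "1.7.4",
    ("2.7.3", "emr-5.8.0"): "1.11.160",
}

# Matches e.g. hadoop-aws-2.7.3.jar, aws-java-sdk-1.7.4.jar,
# aws-java-sdk-s3-1.11.160.jar, or a name with a vendor suffix after the version.
JAR_PATTERN = re.compile(
    r"^(?P<artifact>hadoop-aws|aws-java-sdk(?:-[a-z0-9]+)*)"
    r"-(?P<version>\d+(?:\.\d+)+)(?:-[\w.-]+)?\.jar$"
)


def find_versions(jar_paths):
    """Return {artifact: version} for hadoop-aws and aws-java-sdk* JARs."""
    found = {}
    for path in jar_paths:
        name = path.rsplit("/", 1)[-1]  # keep only the file name
        match = JAR_PATTERN.match(name)
        if match:
            found[match.group("artifact")] = match.group("version")
    return found


def check_sdk(jar_paths, build="vanilla"):
    """Return 'ok', 'mismatch', or 'unknown' for the given classpath listing."""
    found = find_versions(jar_paths)
    expected = EXPECTED_SDK.get((found.get("hadoop-aws"), build))
    if expected is None:
        return "unknown"  # no hadoop-aws JAR, or a pairing this thread does not cover
    sdk_versions = {
        version
        for artifact, version in found.items()
        if artifact.startswith("aws-java-sdk")
    }
    # Every SDK JAR present must be at the expected version; none present is a mismatch.
    return "ok" if sdk_versions == {expected} else "mismatch"


if __name__ == "__main__":
    jars = ["/opt/jars/hadoop-aws-2.7.3.jar", "/opt/jars/aws-java-sdk-1.11.160.jar"]
    print(check_sdk(jars))
```

Keeping the table limited to pairings stated in the thread is deliberate: for any other Hadoop version the sketch reports `unknown` rather than guessing, and the pairing would need to be looked up in that release's own build files.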
