spark-dev mailing list archives

From Sean Owen <>
Subject Re: Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12
Date Mon, 20 May 2019 21:43:12 GMT
Re: 1), I think we tried to fix that on the build side and it requires
flags that not all tar versions (e.g. OS X) have. But that's

I think the Avro + Parquet dependency situation is generally
problematic -- see JIRA for some details. But yes I'm not surprised if
Spark has a different version from Hadoop 2.7.x and that would cause
problems -- if using Avro. I'm not sure the mistake is that the JARs
are missing, as I think this is supposed to be a 'provided'
dependency, but I haven't looked into it. If there's any easy obvious
correction to be made there, by all means.
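
A quick way to see whether two JAR directories disagree on a library
version is to group JAR file names by artifact and list any artifact that
appears with more than one version. The following is only a minimal sketch:
the function name, the file-name pattern and the example directories are
assumptions for illustration, not part of Spark or Hadoop.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention "<artifact>-<version>.jar", where the version
# starts at the first "-<digit>", e.g. "avro-1.8.2.jar" -> ("avro", "1.8.2").
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_version_conflicts(*jar_dirs):
    """Return {artifact: [versions]} for artifacts found in several versions."""
    versions = defaultdict(set)
    for jar_dir in jar_dirs:
        # Search recursively, since Hadoop spreads its JARs over subfolders.
        for jar in Path(jar_dir).rglob("*.jar"):
            match = JAR_NAME.match(jar.name)
            if match:
                versions[match.group("artifact")].add(match.group("version"))
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}


if __name__ == "__main__":
    # Hypothetical install locations; adjust to the local layout.
    conflicts = find_version_conflicts("/opt/spark/jars", "/opt/hadoop/share/hadoop")
    for artifact, found in sorted(conflicts.items()):
        print(artifact, found)
```

This only compares file names, so it cannot tell whether a reported
mismatch actually breaks anything at runtime.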

Not sure what the deal is with jline... I'd expect that's in the
"hadoop-provided" distro? That one may be a real issue if it's
considered provided but isn't used that way.

On Mon, May 20, 2019 at 4:15 PM Koert Kuipers <> wrote:
> we run it without issues on hadoop 2.6 - 2.8, off the top of my head.
> we however do some post-processing on the tarball:
> 1) we fix the ownership of the files inside the tar.gz file (should be uid/gid 0/0, otherwise
> untarring by root can lead to ownership by unknown user).
> 2) add avro-1.8.2.jar and jline-2.14.6.jar to jars folder. i believe these jars being missing
> from the provided profile is simply a mistake.
> best,
> koert
> On Mon, May 20, 2019 at 3:37 PM Michael Heuer <> wrote:
>> Hello,
>> Which Hadoop version or versions are compatible with Spark 2.4.3 and Scala 2.12?
>> The binary distribution spark-2.4.3-bin-without-hadoop-scala-2.12.tgz is missing
>> avro-1.8.2.jar, so when attempting to run with Hadoop 2.7.7 there are classpath conflicts
>> at runtime, as Hadoop 2.7.7 includes avro-1.7.4.jar.
>>    michael
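
The ownership fix described in step 1) of the quoted post-processing can
be done without platform-specific tar flags by rewriting the archive. The
following is a rough sketch using Python's standard tarfile module, not the
script anyone in this thread actually uses. The function name is
hypothetical, and setting the owner names to "root" is an assumption (the
thread only mentions uid/gid 0/0).

```python
import tarfile


def reset_ownership(src_path, dst_path):
    """Copy a .tar.gz archive, setting every member's owner to uid/gid 0/0."""
    with tarfile.open(src_path, "r:gz") as src, \
            tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            member.uid = 0
            member.gid = 0
            # Assumption: use "root" for the symbolic owner names as well.
            member.uname = "root"
            member.gname = "root"
            # Drop extended (pax) header fields that would override the above.
            for key in ("uid", "gid", "uname", "gname"):
                member.pax_headers.pop(key, None)
            # Only regular files carry data; directories and links do not.
            data = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, data)


if __name__ == "__main__":
    reset_ownership(
        "spark-2.4.3-bin-without-hadoop-scala-2.12.tgz",
        "spark-2.4.3-bin-without-hadoop-scala-2.12-root.tgz",  # hypothetical output name
    )
```

Writing to a new file rather than modifying the archive in place keeps
the original download intact for comparison.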
