spark-dev mailing list archives

From Steve Loughran <>
Subject Re: Remove support for Hadoop 2.5 and earlier?
Date Fri, 03 Feb 2017 13:40:21 GMT

> On 3 Feb 2017, at 11:52, Sean Owen <> wrote:
>
> Last year we discussed removing support for things like Hadoop 2.5 and earlier. It was deprecated in Spark 2.1.0. I'd like to go ahead with this, so am checking whether anyone has strong feelings about it.
>
> The original rationale for separate Hadoop profiles was bridging the significant difference between Hadoop 1 and 2, and the moderate differences between 2.0 alpha, 2.1 beta, and 2.2 final. 2.2 is really the "stable" Hadoop 2, and releases from there to current are comparatively very similar from Spark's perspective. We nevertheless continued to make a separate build profile for every minor release, which isn't serving much purpose.
>
> The argument here is mostly that it will simplify the code a little (less reflection, fewer profiles) and simplify the build -- we now have 6 profiles x 2 build systems x 4 major branches in Jenkins, whereas master could go down to 2 profiles.
>
> Realistically, I don't know how much we'd do to support Hadoop before 2.6 anyway. Any distro user is long since on 2.6+.
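(For context, the per-version profiles Sean mentions are selected at build time via Maven profile flags; a sketch of how that looks, using the hadoop-2.7 profile as an example -- the specific Hadoop version number here is illustrative:)

```shell
# Select a Hadoop line via its build profile and pin an exact version;
# dropping pre-2.6 support would mean fewer of these profiles to maintain.
./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package
```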

Hadoop 2.5 doesn't work properly on Java 7, so support for it is kind of implicitly false already. Indeed, Hadoop 2.6 only works on Java 7 if you disable Kerberos, which isn't something I'd recommend in a shared physical cluster, though you may be able to get away with it in an ephemeral one where you lock down all the ports.
