spark-dev mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?
Date Sun, 17 Nov 2019 01:49:59 GMT
Dongjoon, I didn't follow the original Hive 2.3 discussion closely. I
thought the original proposal was to replace Hive 1.2 with Hive 2.3, which
seemed risky, and therefore we only introduced Hive 2.3 under the
hadoop-3.2 profile without removing Hive 1.2. But maybe I'm totally wrong
here...

Sean, Yuming's PR https://github.com/apache/spark/pull/26533 showed that
Hadoop 2 + Hive 2 + JDK 11 looks promising. My main motivation is risk
control rather than demand: coupling the Hive 2.3, Hadoop 3.2, and JDK 11
upgrades together looks too risky.

On Sat, Nov 16, 2019 at 4:03 AM Sean Owen <srowen@gmail.com> wrote:

> I'd prefer simply not making Hadoop 3 the default until 3.1+, rather
> than introduce yet another build combination. Does Hadoop 2 + Hive 2
> work and is there demand for it?
>
> On Sat, Nov 16, 2019 at 3:52 AM Wenchen Fan <cloud0fan@gmail.com> wrote:
> >
> > Do we have a limitation on the number of pre-built distributions? Seems
> > this time we need
> > 1. hadoop 2.7 + hive 1.2
> > 2. hadoop 2.7 + hive 2.3
> > 3. hadoop 3 + hive 2.3
> >
> > AFAIK we always built with JDK 8 (but make it JDK 11 compatible), so
> > don't need to add JDK version to the combination.
> >
> > On Sat, Nov 16, 2019 at 4:05 PM Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
> >>
> >> Thank you for the suggestion.
> >>
> >> Having `hive-2.3` profile sounds good to me because it's orthogonal to
> >> Hadoop 3.
> >> IIRC, originally, it was proposed in that way, but we put it under
> >> `hadoop-3.2` to avoid adding new profiles at that time.
> >>
> >> And, I'm wondering if you are considering additional pre-built
> >> distributions and Jenkins jobs.
> >>
> >> Bests,
> >> Dongjoon.
> >>
>
