hadoop-yarn-dev mailing list archives

From Mayank Bansal <maban...@gmail.com>
Subject Re: Hadoop - Major releases
Date Mon, 09 Mar 2015 23:16:10 GMT
Hi Andrew,

I wish things were as simple as you describe. So far, at least, they have
not been for us.

A couple of things:

1. We will be moving to Hadoop 3 (though not this year); however, I don't
see how we can do another JDK upgrade so soon. So my point is that Hadoop 3
should support JDK 7 as well.

2. We shouldn't be making another major release just for the sake of JDK 8
and classpath isolation, as those can be supported in Hadoop 2 as well. So
what is the motivation for making Hadoop 3 so soon?

Thanks,

Mayank

On Mon, Mar 9, 2015 at 3:34 PM, Andrew Wang <andrew.wang@cloudera.com>
wrote:

> Hi Mayank,
>
> Note that Hadoop 3 does not mean the end of updates for Hadoop 2.x, which
> will keep supporting JDK7 for a while yet. Someone on the original thread
> also proposed keeping Hadoop 3 JDK7-source compatible to make backports to
> 2.x easier. I support this.
>
> Note also that the jump from Hadoop 1 to Hadoop 2 (which is what I assume
> was your previous migration) is a far, far more impactful change than what
> is being proposed for Hadoop 3. Hadoop 3 will look basically like a 2.x
> release except for the JDK8 bump and classpath isolation. The intent is to
> otherwise maintain wire and API compatibility.
>
> Overall your timeline sounds like it fits the schedule I proposed. If we
> release a 3.0 GA this year, it means you can upgrade to a baked 3.1 or 3.2
> next year. Seems like a sound upgrade procedure for a large cluster.
>
> Best,
> Andrew
>
> On Mon, Mar 9, 2015 at 2:24 PM, Mayank Bansal <mabansal@gmail.com> wrote:
>
> > Hi Guys,
> >
> > From my perspective at eBay, we are not going to upgrade to JDK 8 any
> > time soon. We just upgraded to 7 and do not want to move further, at
> > least this year, so I request that you not drop support for JDK 7, as
> > that would be crucial for us to move forward.
> >
> > We also just completed our Hadoop 2 migration for all clusters this year,
> > which we started early last year, so I don't think we can do another
> > major upgrade this year. Stabilizing a major release takes a lot of
> > effort and time; I think Hadoop 3.x makes sense, at least for us, next
> > year.
> >
> > Thanks,
> >
> > Mayank
> >
> > On Mon, Mar 9, 2015 at 12:29 AM, Arun Murthy <acm@hortonworks.com> wrote:
> >
> > > Over the last few days, we have had lots of discussions that have
> > > intertwined several major themes:
> > >
> > >
> > >
> > > # When/why do we make major Hadoop releases?
> > >
> > > # When/how do we move to major JDK versions?
> > >
> > > # To a lesser extent, we have debated another theme: what do we do
> > > about trunk?
> > >
> > >
> > >
> > > For now, let's park JDK & trunk and treat them in separate thread(s).
> > >
> > >
> > >
> > > For a while now, I've had a couple of lampposts in my head which I've
> > > used for guidance. I apologize for not sharing them broadly prior to this
> > > discussion; maybe putting them out here will help. I certainly hope so.
> > >
> > >
> > >
> > >
> > >
> > > Major Releases
> > >
> > >
> > >
> > > Hadoop continues to benefit tremendously from the investment in
> > > stability, validation etc. put in by its *anchor* users: Yahoo, Facebook,
> > > Twitter, eBay, LinkedIn etc.
> > >
> > >
> > >
> > > A historical perspective...
> > >
> > >
> > >
> > > In its lifetime, Apache Hadoop went from monthly to quarterly releases
> > > because, as Hadoop became more and more of a production system (starting
> > > with hadoop-0.16 and more so with hadoop-0.18), users could not absorb
> > > the torrid pace of change.
> > >
> > >
> > >
> > > IMHO, we didn't go far enough in addressing the competing pressures of
> > > stability v/s rapid innovation. We paid for it by losing one of our
> > > anchor users - Facebook - around the time of hadoop-0.19 - they just
> > > forked.
> > >
> > >
> > >
> > > Around the same time, Yahoo hit the same problem (I know, I lived
> > > through it painfully) and got stuck with hadoop-0.20 for a *very* long
> > > time and forked to add Security rather than deal with the next major
> > > release (hadoop-0.21). Later on, Facebook did the same, and,
> > > unfortunately for the community, is stuck - probably forever - on their
> > > fork of hadoop-0.20.
> > >
> > >
> > >
> > > Overall, these were dark days for the community: every anchor user was
> > > on their own fork, and it took a toll on the project.
> > >
> > >
> > >
> > > Recently, thankfully for Hadoop, we have had a period of relative
> > > stability with hadoop-1.x and hadoop-2.x. Even so, there were close
> > > shaves: Yahoo was on hadoop-0.23 for a *very* long time - in fact, they
> > > are only just now finishing their migration to hadoop-2.x.
> > >
> > >
> > >
> > > I think the major lessons here are the obvious ones:
> > >
> > >
> > >
> > > # Compatibility matters
> > >
> > > # Maintaining multiple major releases, in parallel, is a big problem -
> > > it leads to an unproductive, and risky, split in community investment
> > > along different lines.
> > >
> > >
> > >
> > >
> > >
> > > Looking Ahead
> > >
> > >
> > >
> > > Given the above, here are some thoughts for looking ahead:
> > >
> > >
> > >
> > > # Be very conservative about major releases - a major benefit (features)
> > > is required to justify the cost. Let's not compel our anchor users like
> > > Yahoo, Twitter, eBay, and LinkedIn to invest in previous releases rather
> > > than the latest one. Let's hear more from them - and let's be very
> > > accommodating to them - for they play a key role in keeping Hadoop
> > > healthy & stable.
> > >
> > >
> > >
> > > # Be conservative about dropping support for JDKs. In particular, let's
> > > hear from our anchor users on their plans for adopting jdk-1.8. LinkedIn
> > > has already moved to jdk-1.8, which is great for validation, but let's
> > > wait for the rest of our anchor users to move before we drop jdk-1.7. We
> > > did the same thing with jdk-1.6 - we waited for them to move before we
> > > dropped support for jdk-1.6.
> > >
> > >
> > >
> > > Overall, I'd love to hear more from Twitter, Yahoo, eBay and other
> > > anchor users on their plans for jdk-1.8 specifically, and on their
> > > overall appetite for hadoop-3. Let's not finalize our plans for moving
> > > forward until this input has been considered.
> > >
> > >
> > >
> > > Thoughts?
> > >
> > >
> > > thanks,
> > > Arun
> > >
> > >
> > >
> > > Disclaimers (unfortunate that they are necessary):
> > >
> > > # Before people point out vendor affiliations to lend unnecessary color
> > > to my opinions, let me state that hadoop-2 v/s hadoop-3 is a non-issue
> > > for us. For major HDP versions the key is, simply, compatibility, e.g. we
> > > ship major, but compatible, community releases such as
> > > hive-0.13/hive-0.14 in HDP-2.x/HDP-2.x+1 etc.
> > >
> > > # Also, release management is a similar non-issue - we have already had
> > > several individuals step up in the hadoop-2.x line. Expect more of the
> > > same from folks like Andrew, Karthik, Vinod, Steve etc.
> > >
> >
> >
> >
> > --
> > Thanks and Regards,
> > Mayank
> > Cell: 408-718-9370
> >
>



-- 
Thanks and Regards,
Mayank
Cell: 408-718-9370
