hadoop-mapreduce-dev mailing list archives

From Jonathan Hung <jyhung2...@gmail.com>
Subject Re: [DISCUSS] Making submarine to different release model like Ozone
Date Thu, 31 Jan 2019 19:51:36 GMT
+1. This is important for improving the deep-learning-on-Hadoop story.
There has been a lot of momentum behind this recently, and decoupling
Submarine from Hadoop will help it continue.

Jonathan Hung


On Thu, Jan 31, 2019 at 11:04 AM Wangda Tan <wheeleast@gmail.com> wrote:

> Hi devs,
>
> Since we started the Submarine effort last year, we have received a lot of
> feedback. Several companies (such as Netease, China Mobile, etc.) are
> trying to deploy Submarine to their Hadoop clusters alongside big data
> workloads. LinkedIn is also very interested in contributing a Submarine TonY (
> https://github.com/linkedin/TonY) runtime to allow users to use the same
> interface.
>
> From what I can see, there are several issues with keeping Submarine under
> the yarn-applications directory and on the same release cycle as Hadoop:
>
> 1) We started the 3.2.0 release in Sep 2018, but it was not completed until
> Jan 2019. Unpredictable blockers and security issues delayed it
> considerably. We need to iterate on Submarine quickly at this point.
>
> 2) We also see a lot of demand for using Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade to Hadoop 3.x any time
> soon, but their need to run deep learning is urgent. We should decouple
> Submarine from the Hadoop version.
>
> Why keep it within Hadoop at all? First, Submarine includes some
> innovations, such as user-experience enhancements for YARN services and
> containerization support, which we can later add back to Hadoop to address
> common requirements. In addition, the communities developing and using it
> overlap heavily.
>
> We went through several proposals during the discussion on merging Ozone
> to trunk:
>
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3CCAHfHakH6_m3YLdf5a2KQ8+w-5fbVX5aHFgS-x1VaJW8gmnzRLg@mail.gmail.com%3E
>
> I propose adopting the Ozone model: the same master branch, a different
> release cycle, and a different release branch. It is a great example of how
> agile releases can be (2 Ozone releases since Oct 2018), with less overhead
> for setting up CI, projects, etc.
>
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
> - User doc <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html> (3.2.0 release)
> - Blog posts: {Submarine} : Running deep learning workloads on Apache Hadoop <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/> (Chinese translation: Link <https://www.jishuwen.com/d/2Vpu>)
> - Talks: Strata Data Conf NY <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
>
> Thoughts?
>
> Thanks,
> Wangda Tan
>
