sqoop-dev mailing list archives

From Abraham Elmahrek <...@cloudera.com>
Subject Re: Discussing solutions to Sqoop1 and Sqoop2 confusion (was: Code name for Sqoop 2)
Date Fri, 01 Aug 2014 20:00:30 GMT
+1 for proposal 1 as well.


On Fri, Aug 1, 2014 at 11:46 AM, Venkat Ranganathan <
vranganathan@hortonworks.com> wrote:

> +1 for proposal 1 also
>
> Thanks
>
> Venkat
>
> On Fri, Aug 1, 2014 at 9:38 AM, Jarek Jarcec Cecho <jarcec@apache.org>
> wrote:
> > I don’t have any other suggestion either, so let’s discuss which one
> > people would prefer.
> >
> > I’m personally in favor of proposal 1).
> >
> > Jarcec
> >
> > On Jul 28, 2014, at 10:04 AM, Gwen Shapira <gshapira@cloudera.com>
> > wrote:
> >
> >> Thanks for the great summary. I don't have additional suggestions.
> >>
> >> Gwen
> >>
> >> On Sun, Jul 27, 2014 at 11:03 AM, Arvind Prabhakar <arvind@apache.org>
> >> wrote:
> >>> Thanks Gwen and Jarcec. It appears that we all agree to the few basic
> >>> points below:
> >>>
> >>> a) Sqoop2 is a promising effort, although not near completion. We agree
> >>> that there is no need to discuss shutting that down at this time.
> >>> b) The naming of Sqoop2 is such that it leads users/adopters to expect
> >>> it to be better than Sqoop(1) and thus leads to confusion.
> >>>
> >>> The second point (b) above is the key issue that needs resolution. The
> >>> options discussed thus far are as follows:
> >>>
> >>> 1. Put a code name for Sqoop2 so that it is not confused with Sqoop(1).
> >>> This seems to have good community support.
> >>> 2. Use explicit labels such as "stable" vs "beta/alpha/experimental" for
> >>> various Sqoop releases.
> >>> 3. Use explicit UI messaging to warn Sqoop2 users that it is not the same
> >>> as Sqoop(1) and is far behind on feature completeness and stability.
> >>> There seem to be some concerns around how this can be done given the
> >>> client/server architecture of Sqoop2.
> >>> 4. A combination of options 2 and 3 above.
> >>>
> >>> Are there any suggestions to mitigate this problem? Perhaps we should
> >>> cross-post this thread to user list as well to see if they agree with
> >>> the options here and/or if they have any other suggestions.
> >>>
> >>> Regards,
> >>> Arvind Prabhakar
> >>>
> >>>
> >>>
> >>> On Sat, Jul 26, 2014 at 6:50 PM, Jarek Jarcec Cecho <jarcec@apache.org>
> >>> wrote:
> >>>
> >>>> Hi Arvind,
> >>>> thank you very much for sharing your concerns! You’ve raised some very
> >>>> good points.
> >>>>
> >>>> I personally see value in Sqoop 2 as the new architecture will allow us
> >>>> to cover many more use cases and various compliance regulations, and
> >>>> will eventually simplify users’ lives. Based on the recent increase in
> >>>> dev activity, it seems that I’m not the only one who believes in that,
> >>>> and hence I strongly believe that discontinuing the effort is not the
> >>>> right way to go. I’m more than happy to discuss this topic further if
> >>>> you believe that it’s the right way though.
> >>>>
> >>>> Having said that I do believe in Sqoop 2, I have to second Gwen that
> >>>> the current situation is very confusing to our users. I’ve seen a
> >>>> significant number of users confused about why 1.99.4 does not have
> >>>> Avro/HBase/Hive integration when Sqoop 1 already has that. I was
> >>>> anticipating the confusion and hence I suggested using version number
> >>>> 1.99.x instead of 2.0.0 back when we were working on getting the first
> >>>> cut out of the door. I hoped that version 1.99.x would make it obvious
> >>>> to everybody that it’s not “2.0.0” quite yet. Sadly it seems that this
> >>>> alone did not help as much as I hoped.
> >>>>
> >>>> Hence I do see value in changing our public messaging as you’ve
> >>>> suggested to make the message clearer. I personally like the idea of a
> >>>> code name as that is quite popular in other projects and companies
> >>>> (remember Windows Longhorn?) and it seems to convey the message. I do
> >>>> see a lot of flexibility in how we use that code name though - I don’t
> >>>> think that we necessarily have to remove every possible reference to
> >>>> “Sqoop 2” from the face of the earth. I believe that the code name is
> >>>> very well suited for our webpage, wiki and perhaps blog posts, to make
> >>>> it obvious that there is just one single stable Sqoop version and then
> >>>> an ongoing effort that has several cuts available. I believe that just
> >>>> by doing that we will decrease confusion about which version users
> >>>> should download and use. We can discuss to what extent we want to push
> >>>> the code name and to what extent we will keep the reference to
> >>>> “Sqoop 2”. After all, I’m confident that in the not too distant future
> >>>> we will have a Sqoop 2 that offers capabilities and features comparable
> >>>> to Sqoop 1.
> >>>>
> >>>> I don’t think that the code name is the only idea that will decrease
> >>>> the immediate user confusion and hence I’m happy to hear others’
> >>>> thoughts!
> >>>>
> >>>> Jarcec
> >>>>
> >>>> On Jul 26, 2014, at 6:00 PM, Gwen Shapira <gshapira@cloudera.com>
> >>>> wrote:
> >>>>
> >>>>> Thanks Arvind for your detailed explanation.
> >>>>>
> >>>>> I agree that naming releases stable and alpha is a good idea. I don't
> >>>>> agree that it will solve the issue, but we can't know until we try.
> >>>>>
> >>>>> Considering that Sqoop2 is intentionally a client-server architecture
> >>>>> with multiple clients and a REST API as an additional access point, I
> >>>>> believe that it is not feasible to mark the UI as beta.
> >>>>>
> >>>>> I want to stress that I personally believe that Sqoop2 is a very
> >>>>> viable branch effort, to the extent that I am actively contributing to
> >>>>> it.
> >>>>> As Sqoop2 becomes a more and more viable alternative to Sqoop1, we need
> >>>>> to prepare, as a community, to support both versions.
> >>>>>
> >>>>> Considering the number of features currently in Sqoop1 and the number
> >>>>> of production Sqoop1 users, I can see us supporting both versions for
> >>>>> the next 2 years. Making it easy for the community to support both is
> >>>>> my top concern here. Having to resolve endless confusions for two
> >>>>> years doesn't seem like a happy future to me. I see the Python
> >>>>> community fighting the same issue since they broke compatibility
> >>>>> between versions 2 and 3. I'd like to see us learn from those mistakes
> >>>>> and do better.
> >>>>>
> >>>>> I agree that a discussion would have been better than a vote. I was
> >>>>> under the mistaken impression that there is a consensus on the matter.
> >>>>> I renamed the thread to make it clear that we are interested in
> >>>>> hearing opinions from the rest of the community on this subject.
> >>>>>
> >>>>>
> >>>>> Bike-sheddingly yours,
> >>>>>
> >>>>> Gwen Shapira
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Sat, Jul 26, 2014 at 4:44 PM, Arvind Prabhakar <arvind@apache.org>
> >>>>> wrote:
> >>>>>> Thanks for the detailed pointers Gwen. I understand your concerns
> >>>>>> better now. My understanding from these threads as well as what you
> >>>>>> have described is that the confusion you refer to stems from the fact
> >>>>>> that Sqoop2 is not at feature parity with Sqoop(1) yet.
> >>>>>>
> >>>>>> It will be great to *discuss* what are the various ways to address
> >>>>>> this and then call a vote to decide upon the approach to use. Some
> >>>>>> other approaches that I can suggest are:
> >>>>>>
> >>>>>> 1. Calling Sqoop1 explicitly as "stable" in our downloads section, or
> >>>>>> even within the release label. So instead of Sqoop-1.4.5, it would be
> >>>>>> Sqoop-1.4.5-stable.
> >>>>>>
> >>>>>> 2. Alternatively calling Sqoop2 explicitly "alpha", "beta" or
> >>>>>> "experimental". Eg - Sqoop-1.99.4 would become Sqoop-1.99.4-beta.
> >>>>>>
> >>>>>> 3. Or perhaps a combination of both of these.
> >>>>>>
> >>>>>> 4. Plus good UI messaging that clearly outlines the critical
> >>>>>> differences between these two products.
> >>>>>>
> >>>>>> Personally, I do not believe that having a code name will solve the
> >>>>>> issue at hand, and may even make it worse. If the motivation is to
> >>>>>> call out Sqoop2 as something "not-Sqoop", then perhaps we should
> >>>>>> discuss the viability of this branch effort. If we conclude that it
> >>>>>> is not going to progress any further, we could call a vote on
> >>>>>> discontinuing this effort and instead focusing on the main Sqoop1
> >>>>>> branch alone.
> >>>>>>
> >>>>>> Hope you understand my point of view on this.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Arvind Prabhakar
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Jul 25, 2014 at 10:53 AM, Gwen Shapira <gshapira@cloudera.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Arvind,
> >>>>>>>
> >>>>>>> Here are a few more threads from the last month where we had to
> >>>>>>> explain Sqoop2 status or explain that you can't use "sqoop import"
> >>>>>>> with Sqoop2, etc:
> >>>>>>>
> >>>>>>> http://mail-archives.apache.org/mod_mbox/sqoop-user/201407.mbox/%3CCA%2BP7NPNTFuPYqf74M5OFw4e9xKZm2ns%3DZ0ydkkuJ06Jcg31hnw%40mail.gmail.com%3E
> >>>>>>>
> >>>>>>> http://mail-archives.apache.org/mod_mbox/sqoop-user/201407.mbox/%3CCAAJ8D%3D9Ho%3DYH7jdavNAb1gwz19Z5dapmS98yR71KmM5LsjCEVw%40mail.gmail.com%3E
> >>>>>>>
> >>>>>>> http://mail-archives.apache.org/mod_mbox/sqoop-user/201407.mbox/%3CCAPwc21YtdgAm9jO3%2Bs0asBZ2JkG%3DVCp5PLpkO5xQuuBPKQGuTw%40mail.gmail.com%3E
> >>>>>>>
> >>>>>>> http://mail-archives.apache.org/mod_mbox/sqoop-user/201406.mbox/%3CCAOrS3pxWuxL1X9Sb816N_o1Jd==gs9Ww6UjE2PO+FPaw7VHw1Q@mail.gmail.com%3E
> >>>>>>>
> >>>>>>> In addition, I noticed the problem when talking to users at
> >>>>>>> conferences, customers, members of support team, etc (not to mention
> >>>>>>> that I got confused personally when I started out.)
> >>>>>>> I didn't bring much evidence in my first email because I thought
> >>>>>>> there was a wide consensus about the problem.
> >>>>>>>
> >>>>>>> I have several goals with the code-name:
> >>>>>>>
> >>>>>>> * We need to remove the impression that the new version is like
> >>>>>>> Sqoop only better. It is only somewhat like Sqoop and will not be
> >>>>>>> strictly better for many months yet.
> >>>>>>> * We need to clarify that this project is not even close to
> >>>>>>> production quality.
> >>>>>>> * We need to make it easy for us to quickly figure out which version
> >>>>>>> the user is talking about. We also need to make it easy for the
> >>>>>>> users to describe what they are using.
> >>>>>>> * We want to have fun :)
> >>>>>>>
> >>>>>>> I think the name Pelican Project will help with all goals:
> >>>>>>> - It is clearly not the same as Sqoop. So there are no existing
> >>>>>>> expectations on what will be supported.
> >>>>>>> - It is a "Project" and not a product yet.
> >>>>>>> - Sqoop and Pelican don't look or sound similar. No one can expect to
> >>>>>>> use Sqoop by running "pelican-shell" or to use Pelican by calling
> >>>>>>> "sqoop import".
> >>>>>>> - And a cute mascot will make every future presentation and blog post
> >>>>>>> on the topic much more fun.
> >>>>>>>
> >>>>>>> You do bring up good points of concern:
> >>>>>>>
> >>>>>>> a) Existing releases: I disagree that code-names for in-progress
> >>>>>>> development cause too much confusion. They seem fairly common in the
> >>>>>>> software world.
> >>>>>>>
> >>>>>>> http://royal.pingdom.com/2010/05/27/the-developer-obsession-with-code-names-114-interesting-examples/
> >>>>>>>
> >>>>>>> b) "could impact the reproducibility of previous release builds which
> >>>>>>> is not very good for the project."
> >>>>>>> This sounds fairly serious. Can you elaborate on what you mean by
> >>>>>>> reproducibility of release build?
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Jul 25, 2014 at 8:02 AM, Arvind Prabhakar <arvind@apache.org>
> >>>>>>> wrote:
> >>>>>>>> Hi Gwen,
> >>>>>>>>
> >>>>>>>> Other than the recent thread [1] on our user list, is there any
> >>>>>>>> other precedent regarding the confusion this issue has caused? If
> >>>>>>>> so, I would appreciate if you could point it out.
> >>>>>>>>
> >>>>>>>> Personally, I do agree that we ought to have a better mechanism to
> >>>>>>>> communicate the completeness (or incompleteness) of a release in
> >>>>>>>> order to ensure the users understand what benefits or drawbacks they
> >>>>>>>> may get. Incidentally, this was the primary reason for numbering the
> >>>>>>>> Sqoop2 release as 1.99.x, thereby indicating that the release is not
> >>>>>>>> quite 2.0 yet, which seems to be not working as well as expected.
> >>>>>>>>
> >>>>>>>> One traditional way to alleviate this issue would be to label the
> >>>>>>>> release alpha/beta etc. I prefer doing that instead of putting a code
> >>>>>>>> name for the release for a couple of reasons - a) we have already
> >>>>>>>> made releases of Sqoop2 with the previous versioning scheme and hence
> >>>>>>>> changing the name could cause more confusion; and b) renaming the
> >>>>>>>> branches to the new name could impact the reproducibility of previous
> >>>>>>>> release builds which is not very good for the project.
> >>>>>>>>
> >>>>>>>> Another alternative to consider would be to have very clear
> >>>>>>>> messaging in the user-interface of Sqoop2 that it is still work in
> >>>>>>>> progress and not considered at par with Sqoop1.
> >>>>>>>>
> >>>>>>>> [1] http://s.apache.org/TvD
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Arvind Prabhakar
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Jul 25, 2014 at 7:30 AM, Venkat Ranganathan <
> >>>>>>>> vranganathan@hortonworks.com> wrote:
> >>>>>>>>
> >>>>>>>>> +1 for Pelican.   But documentation should not be called The Pelican
> >>>>>>>>> Brief :)
> >>>>>>>>>
> >>>>>>>>> Venkat
> >>>>>>>>>
> >>>>>>>>> On Thu, Jul 24, 2014 at 8:12 PM, Abraham Elmahrek <abe@cloudera.com>
> >>>>>>>>> wrote:
> >>>>>>>>>> There's something about schlep (or schlepper) that I'm having
> >>>>>>>>>> trouble resisting... but... +1 to Pelican.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Jul 24, 2014 at 7:18 PM, Jarek Jarcec Cecho <jarcec@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I’m obviously biased, but +1 to Pelican.
> >>>>>>>>>>>
> >>>>>>>>>>> Jarcec
> >>>>>>>>>>>
> >>>>>>>>>>> On Jul 24, 2014, at 7:06 PM, Martin, Nick <NiMartin@pssd.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> +1 Pelican
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Gwen Shapira [mailto:gshapira@cloudera.com]
> >>>>>>>>>>>> Sent: Thursday, July 24, 2014 9:51 PM
> >>>>>>>>>>>> To: dev@sqoop.apache.org
> >>>>>>>>>>>> Subject: Code name for Sqoop 2 (please vote!)
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> As you may have noticed on the user list, Sqoop2 confuses the hell
> >>>>>>>>>>>> out of everyone.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Part of the problem is the name - Sqoop2 sounds newer and therefore
> >>>>>>>>>>>> better. People expect better quality and more features - which we
> >>>>>>>>>>>> don't deliver :(
> >>>>>>>>>>>>
> >>>>>>>>>>>> Therefore, I propose finding Sqoop2 a project code name. This way it
> >>>>>>>>>>>> will sound experimental and will not have the number "2" next to it.
> >>>>>>>>>>>> We can use the code name to mark the branches in the repo, the
> >>>>>>>>>>>> documentation, the Hue frontend, etc. This will prevent confusion as
> >>>>>>>>>>>> the name Sqoop will go back to refer to just one project, and one
> >>>>>>>>>>>> that actually works.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Suggested names:
> >>>>>>>>>>>> Project Pelican (Based on the animal on O'Reilly's Sqoop book)
> >>>>>>>>>>>> Project Schlep (Yiddish for "moving heavy package")
> >>>>>>>>>>>>
> >>>>>>>>>>>> Friends, contributors, committers and PMC members - please respond
> >>>>>>>>>>>> with either:
> >>>>>>>>>>>> * Vote (+1) on one of the names above
> >>>>>>>>>>>> * Your own suggestion
> >>>>>>>>>>>>
> >>>>>>>>>>>> We'll be looking to close the vote by August 1st (Next week).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Gwen
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> CONFIDENTIALITY NOTICE
> >>>>>>>>> NOTICE: This message is intended for the use of the individual or
> >>>>>>>>> entity to which it is addressed and may contain information that is
> >>>>>>>>> confidential, privileged and exempt from disclosure under applicable
> >>>>>>>>> law. If the reader of this message is not the intended recipient, you
> >>>>>>>>> are hereby notified that any printing, copying, dissemination,
> >>>>>>>>> distribution, disclosure or forwarding of this communication is
> >>>>>>>>> strictly prohibited. If you have received this communication in
> >>>>>>>>> error, please contact the sender immediately and delete it from your
> >>>>>>>>> system. Thank You.
> >>>>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >
>
>
