phoenix-dev mailing list archives

From István Tóth <st...@cloudera.com.INVALID>
Subject Re: Moving Phoenix master to Hbase 2.2
Date Tue, 14 Jan 2020 09:13:16 GMT
Yes, the HBase API signatures change between versions, so we need to
compile each compat module against a specific HBase.

Whether I can define an internal compatibility API that is switchable at
run (startup) time without a performance hit remains to be seen.
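
For illustration only, here is a minimal Java sketch of what such a
startup-time switch could look like. Every name in it (HBaseCompat,
HBase20Compat, HBase22Compat, CompatLoader, CompatDemo) is hypothetical and
exists in neither Phoenix nor HBase. It assumes that each implementation
would be built in its own module against one HBase release, and that the
running HBase version is available as a string at startup (for example from
HBase's VersionInfo.getVersion()). The implementation is resolved once, so
the remaining cost is an interface call per use; whether that is cheap
enough on hot paths is exactly the open question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical internal compatibility API; not an existing Phoenix or HBase type.
interface HBaseCompat {
    String name();
}

// In a real build, each implementation would live in its own Maven module
// and be compiled against one specific HBase release (assumption).
final class HBase20Compat implements HBaseCompat {
    public String name() { return "hbase-2.0"; }
}

final class HBase22Compat implements HBaseCompat {
    public String name() { return "hbase-2.2"; }
}

final class CompatLoader {
    // Maps an HBase version prefix to the factory of the matching compat module.
    private static final Map<String, Supplier<HBaseCompat>> BY_PREFIX = new LinkedHashMap<>();
    static {
        BY_PREFIX.put("2.0.", HBase20Compat::new);
        BY_PREFIX.put("2.2.", HBase22Compat::new);
    }

    // Picks the implementation for the given HBase version string, or fails fast.
    static HBaseCompat forVersion(String hbaseVersion) {
        for (Map.Entry<String, Supplier<HBaseCompat>> e : BY_PREFIX.entrySet()) {
            if (hbaseVersion.startsWith(e.getKey())) {
                return e.getValue().get();
            }
        }
        throw new IllegalStateException("No compat module for HBase " + hbaseVersion);
    }
}

final class CompatDemo {
    // Resolved once when the class is loaded, then reused for every call.
    private static final HBaseCompat COMPAT = CompatLoader.forVersion("2.2.2");

    public static void main(String[] args) {
        System.out.println(COMPAT.name()); // prints "hbase-2.2"
    }
}
```

Failing fast on an unknown version is a deliberate choice in this sketch: an
unsupported HBase would be reported at startup rather than as a
NoSuchMethodError later.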

István

On Tue, Jan 14, 2020 at 3:21 AM Josh Elser <elserj@apache.org> wrote:

> Agree that trying to wrangle branches is just too frustrating and
> error-prone.
>
> It would also be great if we could have a single Phoenix jar that works
> across HBase versions, but would not die on that hill :)
>
> On 12/20/19 5:04 AM, larsh@apache.org wrote:
> > I said _provided_ they can be isolated easily :) (I meant it in the
> > sense of assuming it's easy).
> > As I said though, Tephra has a similar problem and they did a really
> > good job isolating HBase versions. We can learn from them. Sometimes
> > they isolate the change only, and sometimes the class needs to be
> > copied, but even then it's the one class that is copied, not another
> > branch that needs to be kept in sync.
> >
> > This may also drive the desperately necessary refactoring of Phoenix to
> > make these things easier to isolate, or to reduce the copying to a
> > minimum. And we'd need to think through testing carefully.
> >
> > The branch per Phoenix and HBase version is too complex, IMHO. And the
> > complex branch to HBase version mapping that Istvan outlines below
> > confirms that.
> >
> > We should all take a brief look at the Tephra solution and see whether
> > we can apply that. (And since Tephra is part of the fold now, perhaps
> > someone can help there...?)
> > Cheers.
> > -- Lars
> >
> > On Thursday, December 19, 2019, 8:34:15 PM GMT+1, Geoffrey Jacoby <gjacoby@gmail.com> wrote:
> >
> >   Lars,
> >
> > I'm curious why you say the differences are easily isolated -- many of
> > the core classes of Phoenix either directly inherit HBase classes or
> > implement HBase interfaces, and those can vary between minor versions.
> > (See my above example of a new coprocessor hook on BaseRegionObserver.)
> >
> > Geoffrey
> >
> > On Thu, Dec 19, 2019 at 10:54 AM larsh@apache.org <larsh@apache.org> wrote:
> >
> >> Yep. The differences are pretty minimal - provided they can be
> >> isolated easily.
> >> Tephra might be a pretty good model. It supports various versions of
> >> HBase in a single branch and has similar issues as Phoenix
> >> (coprocessors, etc).
> >> -- Lars
> >> On Thursday, December 19, 2019, 7:07:51 PM GMT+1, Josh Elser <elserj@apache.org> wrote:
> >>
> >>    To clarify, you think that compat modules are better than that
> >> separate-branches model in 4.x?
> >>
> >> On 12/18/19 11:29 AM, larsh@apache.org wrote:
> >>> This is really hard to follow.
> >>>
> >>> I think we should do the same with HBase dependencies in Phoenix that
> >>> HBase does with Hadoop dependencies.
> >>>
> >>> That is: We could have a maven module with the specific HBase version
> >>> dependent code.
> >>> Btw. Tephra does the same... A module for HBase version specific code.
> >>> -- Lars
> >>>
> >>> On Tuesday, December 17, 2019, 10:00:31 AM GMT+1, Istvan Toth <stoty@apache.org> wrote:
> >>>
> >>> What do you think about tying the minor releases to HBase minor
> >>> releases (not necessarily one-to-one)
> >>>
> >>> for example (provided 5.1 is 2020H1)
> >>>
> >>> 5.0.0 -> HB 2.0
> >>> 5.1.0 -> HB 2.2.2 (and whatever 2.1 is API compatible with it)
> >>> 5.1.x -> HB 2.2.x (treat as maintenance branch, no major new features)
> >>> 5.2.0 -> HB 2.3.0 (if released by that time)
> >>> 5.2.x -> HB 2.3.x (treat as maintenance branch, no major new features)
> >>> 5.3.0 -> HB 2.3.x (if there is no new major/minor HBase release)
> >>> master -> latest released HBase version
> >>>
> >>> Alternatively, we could stick with the same HBase version for patch
> >>> releases that we used for the first minor release.
> >>>
> >>> This would limit the number of branches that we have to maintain in
> >>> parallel, while providing maintenance branches for older releases, and
> >>> timely-ish Phoenix releases.
> >>>
> >>> The drawback is that users of old HBase versions won't get the latest
> >>> features, on the other hand they can expect more polish.
> >>>
> >>> Istvan
> >>>
> >>> On Thu, Dec 12, 2019 at 8:05 PM Geoffrey Jacoby <gjacoby@apache.org> wrote:
> >>>
> >>>> Since HBase 2.0 is EOM'ed, I'm +1 for not worrying about 2.0.x
> >>>> compatibility with the 5.x branch going forward.
> >>>>
> >>>> Given how coupled Phoenix is to the implementation details of HBase
> >>>> though, I'm not sure trying to abstract those away to keep one Phoenix
> >>>> branch per HBase major version is practical, however. At the least, it
> >>>> would be really complex.
> >>>>
> >>>> For example, in the new year I plan to return to working on the change
> >>>> data capture and Phoenix-level replication features, both of which
> >>>> depend on WALKey interface changes and a new RegionObserver coprocessor
> >>>> hook introduced in HBASE-22622 and HBASE-22623. This was released in
> >>>> HBase 1.5 and will be in the forthcoming HBase 2.3. While the HBase
> >>>> community is discussing EOMing 1.3 right now, and maybe 1.4 will go in
> >>>> the medium term, I don't see all pre-2.3 branch-2's getting deprecated
> >>>> anytime soon.
> >>>>
> >>>> So there will be at least two significant features that can only exist
> >> in
> >>>> some but not all of our 4.x and 5.x branches.
> >>>>
> >>>> Geoffrey
> >>>>
> >>>> On Thu, Dec 12, 2019 at 8:21 AM Josh Elser <elserj@apache.org> wrote:
> >>>>
> >>>>> As much as possible, I'd like to avoid us getting into another
> >>>>> situation with 5.x where we have multiple branches. My hope was/is
> >>>>> that we can keep one Phoenix5 branch that works against an acceptable
> >>>>> set of HBase branches.
> >>>>>
> >>>>> To me, that acceptable set of HBase branches is _a_ 2.1 and 2.2
> >>>>> release. I don't think we need to support all 2.1.x or 2.2.x, nor do
> >>>>> I think we need to keep trying to maintain 2.0.x as it's already end
> >>>>> of support by the HBase community.
> >>>>>
> >>>>> Thanks for updating your PR. I'll add this to my review queue.
> >>>>>
> >>>>> On 12/12/19 1:52 AM, Istvan Toth wrote:
> >>>>>> Hi!
> >>>>>>
> >>>>>> I'd like to start a conversation about supporting HBase 2.2 in the
> >>>>>> master branch.
> >>>>>>
> >>>>>> https://issues.apache.org/jira/browse/PHOENIX-5268 has a slightly
> >>>>>> out of date, but functional PR for HBase 2.2 support on master.
> >>>>>> (Please review and comment if you have the time, I'll try to update
> >>>>>> the PR in the next few days)
> >>>>>>
> >>>>>> The reason that it is not a straightforward decision to merge it is
> >>>>>> that applying that patch breaks compatibility with HBase 2.0.1, the
> >>>>>> current base.
> >>>>>>
> >>>>>> I can see the following outcomes:
> >>>>>>
> >>>>>> - Do nothing
> >>>>>> - Move master to HBase 2.2.2
> >>>>>> - Fork master to Hbase-2.0 and Hbase-2.2 branches
> >>>>>> - Build time compatibility modules
> >>>>>> - Run time compatibility modules
> >>>>>> - Something that I haven't thought of
> >>>>>>
> >>>>>>
> >>>>>> Doing nothing is obviously not a long term solution, as the current
> >>>>>> master doesn't work with any of the currently supported HBase
> >>>>>> branches, but we may postpone the inevitable.
> >>>>>>
> >>>>>> Simply moving master to HBase 2.2 is the most attractive solution
> >>>>>> from a pure developer POV, but there may be other considerations.
> >>>>>>
> >>>>>> Having multiple masters for 2.0 and 2.2 is simple from a code
> >>>>>> perspective, but maintaining two branches is a non-trivial amount of
> >>>>>> additional work. (See the 4.x situation)
> >>>>>>
> >>>>>> Moving the HBase version dependent stuff into a separate module, and
> >>>>>> choosing at build time is not pretty from a code POV, but saves us
> >>>>>> the hassle of maintaining multiple branches, while maintaining
> >>>>>> compatibility with multiple HBase versions, and can handle future API
> >>>>>> changes as well from a single branch. Doing something like this could
> >>>>>> have saved us the effort of maintaining three separate 4.x branches.
> >>>>>>
> >>>>>> I feel that since Phoenix is closely tied to HBase, and requires
> >>>>>> cluster-wide HBase configuration to work anyway, handling the
> >>>>>> different HBase versions from the same binary/JAR is not worth the
> >>>>>> effort.
> >>>>>>
> >>>>>> Please share your thoughts!
> >>>>>>
> >>>>>> regards
> >>>>>> Istvan
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>


-- 
*István Tóth* | Sr. Software Engineer
t. (36) 70 283-1788
stoty@cloudera.com <https://www.cloudera.com>