tez-dev mailing list archives

From Hitesh Shah <hit...@apache.org>
Subject Re: [DISCUSS] Publishing and releasing jars for different hadoop version dependencies
Date Thu, 05 Mar 2015 00:46:48 GMT
From an ASF perspective, the only verifiable releases are source releases. The binaries are just
convenience artifacts that can also be made available with a given release. Hence, in terms of
supporting multiple hadoop versions, we do want to allow various users/distros to compile
Tez against their particular version of hadoop.

From a run-time point of view, if Tez compiled against hadoop-2.6 is run on a 2.4 cluster,
it should work normally as long as acls are disabled (via the tez config tez.am.acls.enabled).
That said, there are probably some improvements that could be made to handle the case where
acls are enabled on a 2.4 cluster in a cleaner manner.
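
As a rough illustration of what handling that case more cleanly could look like, the sketch below probes for an optional class by name and falls back with a warning when the class, or something it links against, is missing. This is not code from Tez: the helper `OptionalManagerLoader` and the class name `org.example.HypotheticalAclManager` are made up for the example, and the real ACL manager may need constructor arguments or other setup.

```java
import java.util.Optional;
import java.util.logging.Logger;

public class OptionalManagerLoader {
    private static final Logger LOG =
        Logger.getLogger(OptionalManagerLoader.class.getName());

    /**
     * Tries to instantiate an optional class through its no-argument constructor.
     * Returns an empty Optional instead of throwing when the class is absent
     * or cannot be linked (for example, a dependency is missing at run time).
     */
    public static Optional<Object> tryLoad(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return Optional.of(clazz.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException | LinkageError e) {
            LOG.warning("Optional class " + className
                + " is unavailable, continuing without it: " + e);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical class name standing in for the ACL manager.
        Optional<Object> manager = tryLoad("org.example.HypotheticalAclManager");
        System.out.println(manager.isPresent()
            ? "ACL manager loaded"
            : "ACL manager not available, running without ACL support");
    }
}
```

Catching `LinkageError` alongside the reflective exceptions matters here: a class that is on the classpath but compiled against a newer hadoop typically fails with `NoClassDefFoundError` rather than `ClassNotFoundException`.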

thanks
— Hitesh

On Mar 4, 2015, at 9:21 AM, Chris K Wensel <chris@wensel.net> wrote:

> compile what against hadoop 2.4? Tez? Hopefully no one except Tez devs ever compiles Tez
(once the apache committers offer up pre-built binaries; I only ever do it for this reason).
> 
> if compiling application code against Tez and Hadoop 2.4, the jar won't come into play
unless running tests (so i believe).
> 
> I would then enhance option two to gracefully fail if -acls (the Manager) is not applicable
(on hadoop 2.4) but mistakenly included in the 2.4 classpath (testing app code against hadoop
2.4)
> 
> of course then this is really option 1 now with two jars.
> 
> ckw
> 
>> On Mar 2, 2015, at 3:05 PM, Hitesh Shah <hitesh@apache.org> wrote:
>> 
>> Thanks for the suggestions, Chris. Filed TEZ-2168 for this. 
>> 
>> At this point, I am inclined to follow option 2 mainly to retain the ability for
users to compile against hadoop 2.4. I am not sure if there is a simple and performant way
( without using reflection for all 2.6 specific calls ) to retain compile compatibility with
option 1.
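
For readers unfamiliar with the reflection approach mentioned here, a minimal sketch follows. It looks up a public no-argument method by name at run time and returns a fallback when the method does not exist, so the calling code compiles without the newer API on the classpath. The helper `ReflectiveCall` is hypothetical and not part of Tez; `String.length` merely stands in for a version-specific call.

```java
import java.lang.reflect.Method;

public class ReflectiveCall {
    /**
     * Invokes a public no-argument method by name if the target's class has it,
     * otherwise returns the supplied fallback value.
     */
    public static Object callIfPresent(Object target, String methodName, Object fallback) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (NoSuchMethodException e) {
            // The method is missing in this version of the library.
            return fallback;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Call to " + methodName + " failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callIfPresent("tez", "length", -1));       // method exists
        System.out.println(callIfPresent("tez", "noSuchMethod", -1)); // falls back
    }
}
```

Each such call pays for a method lookup and loses compile-time type checking, which is the simplicity and performance concern raised above. Caching the `Method` object reduces the lookup cost but not the loss of type safety.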
>> 
>> Any other comments from other folks on this issue in general or on the 2 options that
Chris suggested? 
>> 
>> thanks
>> — Hitesh
>> 
>> 
>> On Feb 26, 2015, at 1:18 PM, Chris K Wensel <chris@wensel.net> wrote:
>> 
>>> The immediate issue is having two mutually exclusive artifacts: tez-yarn-timeline-history
and tez-yarn-timeline-history
>>> 
>>> outside of ATSHistoryACLPolicyManager, the code is identical. just the dependencies
are changed.
>>> 
>>> TezClient attempts to load this Manager, under the assumption that, if it exists, it
is running on hadoop 2.6. (running on 2.4 is fatal)
>>> 
>>> My recommendation would be never to change artifact names (or conditionally choose
them) inside of major releases, but accreting new, optional, ones as versions progress is
fine.
>>> 
>>> thus I would either:
>>> 
>>> create a single artifact tez-yarn-timeline-history compiled with a default dep
of hadoop 2.6, that includes the Manager. update the TezClient code to gracefully fail if
the Manager is not applicable (the runtime env is Hadoop 2.4).
>>> 
>>> or
>>> 
>>> offer tez-yarn-timeline-history-with-acls as an optional artifact for Hadoop
2.6 deployments, with the single Manager class in it, which in turn requires the tez-yarn-timeline-history
artifact -- which is sufficient for a 2.4 runtime. if the user provides the additional -with-acls
artifact, they are knowingly going to have problems on Hadoop 2.4.
>>> 
>>> I prefer the first as it keeps my build file simple. graceful degradation of
services per environment (with appropriate logging) is a well accepted practice.
>>> 
>>> and you can now test Tez across multiple versions of Hadoop/Yarn at runtime (outside
of compile time).
>>> 
>>> we do this with Cascading, just simple build file modifications to verify binary
compatibility (vendors fork this repo to verify their distributions, and have been known to find
critical bugs):
>>> 
>>> https://github.com/Cascading/cascading.compatibility
>>> 
>>> ckw
>>> 
>>>> On Feb 26, 2015, at 11:03 AM, Hitesh Shah <hitesh@apache.org> wrote:
>>>> 
>>>> Hi folks, 
>>>> 
>>>> Chris raised a good point earlier in terms of publishing jars for use against
different versions of hadoop. For the most part, I think we have done well to ensure that
the user-facing modules are version agnostic but the same does not hold for other modules
which at times are needed by other applications for testing.
>>>> 
>>>> There aren’t really too many different options we can try.  The simplest
option I can think of is just to build tez against different versions of hadoop with the tez.version
set to something along the lines of “tez.version-hadoop.version”. This would imply having
tez-api-0.6.0-hadoop2.4 or tez-api-0.6.0-hadoop26. From a usability point of view, depending
on the option we pick, users will need to switch their dependencies to point to an appropriate
version based on what version of hadoop they are using. For apps such as hive and pig, they
will need to manage picking a particular version of tez based on which hadoop profile they
are building against. 
>>>> 
>>>> Any other suggestions for publishing version dependent jars?
>>>> 
>>>> For binary releases, should we release only the minimal tarball? or both
the minimal and full tarballs? The full tarball is the recommended deployment model as it
is more robust towards compatibility on a changing cluster. It should work in most scenarios
as long as the hadoop client libraries that Tez depends on are compatible with the servers
running on the cluster.
>>>> 
>>>> General questions for the community/past release managers: 
>>>> - Should we retain the simple version ( i.e. plain only x.y.z ) when building
against the default version of hadoop as determined by Tez? This “default.version” will
have a tendency to evolve over time :) . These simple version jars would be in addition to
the version specific jars. 
>>>> - What versions of hadoop should we compile against? 2.2, 2.4 and 2.6 or
2.2,2.3,2.4,2.5,2.6 ? Please note that I am ignoring the minor version so we should pick the
latest version in each line i.e. 2.2.1 over 2.2.0 if 2.2.1 exists. 
>>>> 
>>>> Any other comments? 
>>>> 
>>>> thanks
>>>> — Hitesh
>>>> 
>>>> 
>>> 
>>> —
>>> Chris K Wensel
>>> chris@wensel.net
>>> 
>>> 
>>> 
>>> 
>> 
> 
> —
> Chris K Wensel
> chris@wensel.net
> 
> 
> 
> 

