tez-dev mailing list archives

From Hitesh Shah <hit...@apache.org>
Subject [DISCUSS] Publishing and releasing jars for different hadoop version dependencies
Date Thu, 26 Feb 2015 19:03:58 GMT
Hi folks, 

Chris raised a good point earlier in terms of publishing jars for use against different versions
of hadoop. For the most part, I think we have done well to ensure that the user-facing modules
are version agnostic, but the same does not hold for other modules which at times are needed
by other applications for testing.

There aren’t really too many different options we can try.  The simplest option I can think
of is just to build tez against different versions of hadoop with the tez.version set to something
along the lines of “tez.version-hadoop.version”. This would imply having tez-api-0.6.0-hadoop2.4
or tez-api-0.6.0-hadoop26. From a usability point of view, depending on the option we pick,
users will need to switch their dependencies to point to an appropriate version based on what
version of hadoop they are using. For apps such as hive and pig, they will need to manage
picking a particular version of tez based on which hadoop profile they are building against.
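
To make the naming scheme concrete, here is a minimal sketch of how such a version string could be composed. The function names are hypothetical and only for illustration, and the choice to keep just the major.minor part of the hadoop version is an assumption, not something we have settled on.

```python
def qualified_version(tez_version: str, hadoop_version: str) -> str:
    """Compose '<tez.version>-hadoop<major.minor>' (hypothetical scheme)."""
    # Assumption: only the major.minor part of the hadoop version is kept,
    # e.g. "2.4.1" -> "2.4".
    major_minor = ".".join(hadoop_version.split(".")[:2])
    return f"{tez_version}-hadoop{major_minor}"


def artifact_name(module: str, tez_version: str, hadoop_version: str) -> str:
    """Build the jar base name for a module, e.g. tez-api-0.6.0-hadoop2.4."""
    return f"{module}-{qualified_version(tez_version, hadoop_version)}"


print(artifact_name("tez-api", "0.6.0", "2.4.1"))  # tez-api-0.6.0-hadoop2.4
```

A downstream project such as hive or pig would then select the matching version string from whichever hadoop profile it is building against.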

Any other suggestions for publishing version dependent jars?

For binary releases, should we release only the minimal tarball, or both the minimal and full
tarballs? The full tarball is the recommended deployment model as it is more robust towards
compatibility on a changing cluster. It should work in most scenarios as long as the hadoop
client libraries that Tez depends on are compatible with the servers running on the cluster.

General questions for the community/past release managers: 
   - Should we retain the simple version (i.e. plain x.y.z only) when building against the
default version of hadoop as determined by Tez? This “default.version” will have a tendency
to evolve over time :) . These simple version jars would be in addition to the version specific
ones.
   - What versions of hadoop should we compile against? 2.2, 2.4 and 2.6, or 2.2, 2.3, 2.4, 2.5
and 2.6? Please note that I am ignoring the last (maintenance) version component, so we should
pick the latest release in each line, i.e. 2.2.1 over 2.2.0 if 2.2.1 exists.
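
The "latest release in each line" rule can be sketched as below. This is only an illustration: the function name is hypothetical, and the list of versions is made-up sample input, not a statement about which hadoop releases actually exist.

```python
def latest_per_line(versions):
    """Return the highest x.y.z release for each x.y line, ordered by line."""
    best = {}
    for version in versions:
        # Compare numerically so that, e.g., 2.10.0 sorts after 2.9.0.
        parts = tuple(int(p) for p in version.split("."))
        line = parts[:2]
        if line not in best or parts > best[line]:
            best[line] = parts
    return [".".join(str(p) for p in best[line]) for line in sorted(best)]


# Hypothetical sample input for illustration only.
available = ["2.2.0", "2.2.1", "2.4.0", "2.4.1", "2.6.0"]
print(latest_per_line(available))  # ['2.2.1', '2.4.1', '2.6.0']
```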
Any other comments? 

— Hitesh
