spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Wendell <pwend...@gmail.com>
Subject Experience using binary packages on various Hadoop distros
Date Tue, 24 Mar 2015 19:05:03 GMT
Hey All,

For a while we've published binary packages with different Hadoop
client's pre-bundled. We currently have three interfaces to a Hadoop
cluster (a) the HDFS client (b) the YARN client (c) the Hive client.

Because (a) and (b) are supposed to be backwards compatible
interfaces. My working assumption was that for the most part (modulo
Hive) our packages work with *newer* Hadoop versions. For instance,
our Hadoop 2.4 package should work with HDFS 2.6 and YARN 2.6.
However, I have heard murmurings that these are not compatible in
practice.

So I have three questions I'd like to put out to the community:

1. Have people had difficulty using 2.4 packages with newer Hadoop
versions? If so, what specific incompatibilities have you hit?
2. Have people had issues using our binary Hadoop packages in general
with commercial or Apache Hadoop distro's, such that you have to build
from source?
3. How would people feel about publishing a "bring your own Hadoop"
binary, where you are required to point us to a local Hadoop
distribution by setting HADOOP_HOME? This might be better for ensuring
full compatibility:
https://issues.apache.org/jira/browse/SPARK-6511

- Patrick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Mime
View raw message