spark-dev mailing list archives

From Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
Subject Re: spark-ec2 default to Hadoop 2
Date Sun, 01 Mar 2015 23:14:23 GMT
One reason I wouldn't change the default is that the Hadoop 2 launched by
spark-ec2 is not a full Hadoop 2 distribution -- it's more of a hybrid
Hadoop version built from CDH4 (it uses HDFS 2, but not YARN, AFAIK).

Also, our default Hadoop version in the Spark build is still 1.0.4 [1], so
it makes sense to stick to that in spark-ec2 as well?

[1] https://github.com/apache/spark/blob/master/pom.xml#L122

Thanks
Shivaram

On Sun, Mar 1, 2015 at 2:59 PM, Nicholas Chammas <nicholas.chammas@gmail.com
> wrote:

>
> https://github.com/apache/spark/blob/fd8d283eeb98e310b1e85ef8c3a8af9e547ab5e0/ec2/spark_ec2.py#L162-L164
>
> Is there any reason we shouldn't update the default Hadoop major version in
> spark-ec2 to 2?
>
> Nick
>
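For readers following along: the default Nick links to is just an option
default in spark_ec2.py's command-line parser. A minimal sketch, assuming an
optparse-based CLI (the option name matches the linked lines; the usage
string and help text here are illustrative, not the actual source):

```python
# Hedged sketch of how spark-ec2 exposes the Hadoop major version.
# The option name comes from the linked spark_ec2.py lines; everything
# else (usage string, help text) is assumed for illustration.
from optparse import OptionParser

parser = OptionParser(usage="spark-ec2 [options] <action> <cluster_name>")
parser.add_option(
    "--hadoop-major-version", default="1",
    help="Major version of Hadoop to launch (default: %default)")

(opts, args) = parser.parse_args([])  # no flags passed on the CLI
print(opts.hadoop_major_version)  # -> 1
```

Changing the proposal under discussion would amount to flipping that
`default="1"` to `"2"`; the question in this thread is whether doing so is
safe given what spark-ec2's "Hadoop 2" actually is.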
