spark-dev mailing list archives

From Andy Konwinski <andykonwin...@gmail.com>
Subject Re: Important: Changes to Spark's build system on master branch
Date Wed, 21 Aug 2013 06:36:01 GMT
Hey Jey,

I'd just like to add that you can also build against Hadoop 2 without
modifying the pom.xml file by passing the hadoop.version property on the
command line, like this:

mvn -Dhadoop.version=2.0.0-mr1-cdh4.1.2 clean verify

Also, when you mentioned building with Maven in your instructions I think
you forgot to finish writing out your example for activating the yarn
profile, which I think would be something like:

mvn -Phadoop2-yarn clean verify

...right?
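
If so, the two options can presumably be combined. Here is a rough sketch of how I'd compose the command line. The spark_mvn_cmd helper is hypothetical (it is not part of Spark) and only prints the command rather than running it. It also assumes that yarn.version can be overridden with -D in the same way as hadoop.version, and that the YARN version matches the Hadoop version, neither of which I have verified against the new pom.xml.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): prints the Maven command
# for a given Hadoop version instead of running it.
spark_mvn_cmd() {
  hadoop_version="$1"
  with_yarn="${2:-false}"
  args="-Dhadoop.version=${hadoop_version}"
  if [ "$with_yarn" = "true" ]; then
    # Assumption: yarn.version can be overridden with -D just like
    # hadoop.version, and should match the Hadoop version.
    args="-Phadoop2-yarn ${args} -Dyarn.version=${hadoop_version}"
  fi
  echo "mvn ${args} clean verify"
}

# Non-YARN build: no profile needed, only the version property.
spark_mvn_cmd 2.0.0-mr1-cdh4.1.2

# YARN build: activates the hadoop2-yarn profile as well.
spark_mvn_cmd 2.0.5-alpha true
```

Printing the command first makes it easy to check the flags before kicking off a long build.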

BTW, I've set up the AMPLab Jenkins Spark Maven Hadoop2 project to build
using the new options:
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-Hadoop2/

Andy

On Tue, Aug 20, 2013 at 8:39 PM, Jey Kottalam <jey@cs.berkeley.edu> wrote:

> The master branch of Spark has been updated with PR #838, which
> changes aspects of Spark's interface to Hadoop. This involved also
> making changes to Spark's build system as documented below. The
> documentation will be updated with this information shortly.
>
> Please feel free to reply to this thread with any questions or if you
> encounter any problems.
>
> -Jey
>
>
>
> When Building Spark
> ===============
>
> - General: The default version of Hadoop has been updated to 1.2.1 from
> 1.0.4.
>
> - General: You will probably need to perform an "sbt clean" or "mvn
> clean" to remove old build files. SBT users may also need to perform a
> "clean" when changing Hadoop versions (or at least delete the
> lib_managed directory).
>
> - SBT users: The version of Hadoop used can be specified by setting
> the SPARK_HADOOP_VERSION environment variable when invoking sbt, and
> YARN-enabled builds can be created by setting SPARK_WITH_YARN=true.
> Example:
>
>     # Using Hadoop 1.1.0 (a version of Hadoop without YARN)
>     SPARK_HADOOP_VERSION=1.1.0 ./sbt/sbt package assembly
>
>     # Using Hadoop 2.0.5-alpha (which is a YARN-based version of Hadoop)
>     SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_WITH_YARN=true ./sbt/sbt package assembly
>
> - Maven users: Set the Hadoop version built against by editing the
> "pom.xml" file in the root directory and changing the "hadoop.version"
> property (and the "yarn.version" property, if applicable). If you are
> building with YARN disabled, you no longer need to enable any Maven
> profiles (i.e. "-P" flags). To build with YARN enabled, use the
> "hadoop2-yarn" Maven profile. Example:
>
> - The "make-distribution.sh" script has been updated to take
> additional parameters to select the Hadoop version and enable YARN.
>
>
>
> When Writing Spark Applications
> ========================
>
>
> - Non-YARN users: If you wish to use HDFS, you will need to add the
> appropriate version of the "hadoop-client" artifact from the
> "org.apache.hadoop" group to your project.
>
>     SBT example:
>         // "force()" is required because "1.1.0" is less than Spark's default of "1.2.1"
>         "org.apache.hadoop" % "hadoop-client" % "1.1.0" force()
>
>     Maven example:
>         <dependency>
>           <groupId>org.apache.hadoop</groupId>
>           <artifactId>hadoop-client</artifactId>
>           <!-- the brackets are needed to tell Maven that this is a hard dependency on version "1.1.0" exactly -->
>           <version>[1.1.0]</version>
>         </dependency>
>
>
> - YARN users: You will now need to set SPARK_JAR to point to the
> spark-yarn assembly instead of the spark-core assembly previously
> used.
>
>   SBT Example:
>        SPARK_JAR=$PWD/yarn/target/spark-yarn-assembly-0.8.0-SNAPSHOT.jar \
>         ./run spark.deploy.yarn.Client \
>           --jar $PWD/examples/target/scala-2.9.3/spark-examples_2.9.3-0.8.0-SNAPSHOT.jar \
>           --class spark.examples.SparkPi --args yarn-standalone \
>           --num-workers 3 --worker-memory 2g --master-memory 2g --worker-cores 1
>
