spark-dev mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: Important: Changes to Spark's build system on master branch
Date Wed, 21 Aug 2013 05:02:04 GMT
Hi Jey, just want to clarify: do these changes apply to the master branch on
GitHub and not to the Apache git repository?

Thanks,

Henry


On Tue, Aug 20, 2013 at 8:39 PM, Jey Kottalam <jey@cs.berkeley.edu> wrote:

> The master branch of Spark has been updated with PR #838, which
> changes aspects of Spark's interface to Hadoop. This also involved
> making changes to Spark's build system, as documented below. The
> documentation will be updated with this information shortly.
>
> Please feel free to reply to this thread with any questions or if you
> encounter any problems.
>
> -Jey
>
>
>
> When Building Spark
> ===============
>
> - General: The default version of Hadoop has been updated from 1.0.4
> to 1.2.1.
>
> - General: You will probably need to perform an "sbt clean" or "mvn
> clean" to remove old build files. SBT users may also need to perform a
> "clean" when changing Hadoop versions (or at least delete the
> lib_managed directory).
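The clean-up steps above can be sketched as shell commands (a minimal sketch, assuming they are run from the root of a Spark checkout):

```shell
# Remove old build output (SBT users):
./sbt/sbt clean

# ...or, for Maven users:
mvn clean

# SBT users switching Hadoop versions: deleting the cached dependency
# jars may be enough instead of a full clean.
rm -rf lib_managed
```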
>
> - SBT users: The version of Hadoop used can be specified by setting
> the SPARK_HADOOP_VERSION environment variable when invoking sbt, and
> YARN-enabled builds can be created by setting SPARK_WITH_YARN=true.
> Example:
>
>     # Using Hadoop 1.1.0 (a version of Hadoop without YARN)
>     SPARK_HADOOP_VERSION=1.1.0 ./sbt/sbt package assembly
>
>     # Using Hadoop 2.0.5-alpha (which is a YARN-based version of Hadoop)
>     SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_WITH_YARN=true ./sbt/sbt package assembly
>
> - Maven users: Set the Hadoop version to build against by editing the
> "pom.xml" file in the root directory and changing the "hadoop.version"
> property (and the "yarn.version" property, if applicable). If you are
> building with YARN disabled, you no longer need to enable any Maven
> profiles (i.e. "-P" flags). To build with YARN enabled, use the
> "hadoop2-yarn" Maven profile.
>
> - The "make-distribution.sh" script has been updated to take
> additional parameters to select the Hadoop version and enable YARN.
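The message doesn't show the new parameters; as a hedged sketch, with flag names assumed rather than taken from the script (check the usage notes at the top of make-distribution.sh for the authoritative names):

```shell
# Assumed flags: "--hadoop <version>" selects the Hadoop version and
# "--with-yarn" enables YARN support; verify against the script itself.
./make-distribution.sh --hadoop 2.0.5-alpha --with-yarn
```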
>
>
>
> When Writing Spark Applications
> ========================
>
>
> - Non-YARN users: If you wish to use HDFS, you will need to add the
> appropriate version of the "hadoop-client" artifact from the
> "org.apache.hadoop" group to your project.
>
>     SBT example:
>         // "force()" is required because "1.1.0" is less than Spark's
> default of "1.2.1"
>         "org.apache.hadoop" % "hadoop-client" % "1.1.0" force()
>
>     Maven example:
>         <dependency>
>           <groupId>org.apache.hadoop</groupId>
>           <artifactId>hadoop-client</artifactId>
>           <!-- the brackets are needed to tell Maven that this is a hard dependency on version "1.1.0" exactly -->
>           <version>[1.1.0]</version>
>         </dependency>
>
>
> - YARN users: You will now need to set SPARK_JAR to point to the
> spark-yarn assembly instead of the spark-core assembly previously
> used.
>
>   Example:
>        SPARK_JAR=$PWD/yarn/target/spark-yarn-assembly-0.8.0-SNAPSHOT.jar \
>          ./run spark.deploy.yarn.Client \
>            --jar $PWD/examples/target/scala-2.9.3/spark-examples_2.9.3-0.8.0-SNAPSHOT.jar \
>            --class spark.examples.SparkPi --args yarn-standalone \
>            --num-workers 3 --worker-memory 2g --master-memory 2g --worker-cores 1
>
