spark-dev mailing list archives

From Alex Cozzi <alexco...@gmail.com>
Subject Re: testing 0.9.0-incubating and maven
Date Thu, 16 Jan 2014 22:12:36 GMT
Thanks for the help. I am making progress, but I found I need to do a bit of fiddling with
excluding dependencies from Spark in order to have mine take effect. As soon as I have a
working pom I will post it here as an example.
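
In the meantime, here is roughly what I am trying: excluding Spark's transitive
Hadoop dependency so that my own takes effect (just a sketch of the exclusion
approach, not the working pom yet; the Scala suffix is the one I happen to use):

<dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>0.9.0-incubating</version>
        <!-- keep Spark's Hadoop 1.x client out so my Hadoop 2.2 dependency wins -->
        <exclusions>
                <exclusion>
                        <groupId>org.apache.hadoop</groupId>
                        <artifactId>hadoop-client</artifactId>
                </exclusion>
        </exclusions>
</dependency>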

Alex Cozzi
alexcozzi@gmail.com
------------------------------------------------------
eBay is hiring! Check out our job openings
http://ebay.referrals.selectminds.com/?et=OlVHMHJl

On Jan 16, 2014, at 1:54 PM, Patrick Wendell <pwendell@gmail.com> wrote:

> Hey Alex,
> 
> Maven profiles only affect the Spark build itself. They do not
> transitively affect your own build.
> 
> Check out the docs for how to deploy applications on YARN:
> http://spark.incubator.apache.org/docs/latest/running-on-yarn.html
> 
> When compiling your application, you should just explicitly add the
> Hadoop version you depend on to your own build (e.g. a hadoop-client
> dependency). Take a look at the example here, where we show adding
> hadoop-client:
> 
> http://spark.incubator.apache.org/docs/latest/quick-start.html
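> 
> Concretely, that could look something like this (a sketch only; the
> version here is just an example, use whatever matches your cluster):
> 
>     <!-- hadoop-client matching the cluster's Hadoop version -->
>     <dependency>
>         <groupId>org.apache.hadoop</groupId>
>         <artifactId>hadoop-client</artifactId>
>         <version>2.2.0</version>
>     </dependency>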
> 
> When deploying Spark applications on YARN, you actually want to mark
> Spark as a provided dependency in your application's Maven build and
> bundle your application as an assembly jar, then submit it along with
> a Spark YARN bundle to a YARN cluster. The instructions are the same
> as they were in 0.8.1.
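> 
> In pom terms that means something like this (a sketch; adjust the
> artifact suffix and version to your own build):
> 
>     <!-- Spark is supplied on the cluster side at runtime, so it is
>          marked provided and left out of the assembly jar -->
>     <dependency>
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-core_2.10</artifactId>
>         <version>0.9.0-incubating</version>
>         <scope>provided</scope>
>     </dependency>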
> 
> For the Spark jar you want to submit to YARN, you can download the
> precompiled one.
> 
> It might make sense to try this pipeline with 0.8.1 and get it working
> there first. It sounds more like you are dealing with getting the
> build set up than with a particular issue in the 0.9.0 RC.
> 
> - Patrick
> 
> On Thu, Jan 16, 2014 at 1:13 PM, Alex Cozzi <alexcozzi@gmail.com> wrote:
>> Hi Patrick,
>> thank you for testing. I think I found out what is wrong: I am trying to build my
>> own examples, which also depend on another library that in turn depends on Hadoop 2.2.
>> What was happening is that my library brings in Hadoop 2.2, while Spark depends on
>> Hadoop 1.0.4, and so I think I get conflicting versions of the classes.
>> 
>> A couple of things are not clear to me:
>> 
>> 1: do the published artifacts support YARN and Hadoop 2.2, or will I need to make
>> my own build?
>> 2: if they do, how do I activate the profiles in my maven config? I tried mvn -Pyarn
>> compile, but it does not work (maven says “[WARNING] The requested profile "yarn" could
>> not be activated because it does not exist.”)
>> 
>> 
>> essentially I would like to specify the spark dependencies as:
>> 
>> <dependencies>
>>                <dependency>
>>                        <groupId>org.scala-lang</groupId>
>>                        <artifactId>scala-library</artifactId>
>>                        <version>${scala.version}</version>
>>                </dependency>
>> 
>>                <dependency>
>>                        <groupId>org.apache.spark</groupId>
>>                        <artifactId>spark-core_${scala.tools.version}</artifactId>
>>                        <version>0.9.0-incubating</version>
>>                </dependency>
>> </dependencies>
>> 
>> and tell maven to use the “yarn” profile for this dependency, but I do not seem
>> to be able to make it work.
>> Does anybody have any suggestions?
>> 
>> Alex

