spark-user mailing list archives

From Raghav Shankar <raghav0110...@gmail.com>
Subject Re: Submitting Spark Applications using Spark Submit
Date Wed, 17 Jun 2015 04:04:46 GMT
To clarify, I am using a Spark standalone cluster.

On Tuesday, June 16, 2015, Yanbo Liang <ybliang8@gmail.com> wrote:

> If you run Spark on YARN, the simplest way is to replace the
> $SPARK_HOME/lib/spark-****.jar with your own build of the Spark jar and run
> your application.
> The spark-submit script will upload this jar to the YARN cluster automatically,
> and then you can run your application as usual.
> It does not matter which version of Spark is installed on your YARN cluster.
>
> 2015-06-17 10:42 GMT+08:00 Raghav Shankar <raghav0110.cs@gmail.com>:
>
>> The documentation says spark.driver.userClassPathFirst can only be used
>> in cluster mode. Does this mean I have to set the --deploy-mode option
>> for spark-submit to cluster? Or can I still use the default client? My
>> understanding is that even in the default deploy mode, spark still uses
>> the slave machines I have on ec2.
>>
>> Also, the spark.driver.extraLibraryPath property mentions that I can
>> provide a path for special libraries on the spark-submit command line
>> options. Do my jar files in this path have to be the same name as the jar
>> used by spark, or is it intelligent enough to identify that two jars are
>> supposed to be the same thing? If they are supposed to be the same name,
>> how can I find out the name I should use for my jar? E.g., if I just name my
>> modified spark-core jar spark.jar, put it in a lib folder, and provide
>> the path of the folder to spark-submit, would that be enough to tell Spark
>> to use that spark-core jar instead of the default?
>>
>> Thanks,
>> Raghav
>>
>> On Jun 16, 2015, at 7:19 PM, Will Briggs <wrbriggs@gmail.com> wrote:
>>
>> If this is research-only, and you don't want to have to worry about
>> updating the jars installed by default on the cluster, you can add your
>> custom Spark jar using the "spark.driver.extraLibraryPath" configuration
>> property when running spark-submit, and then use the experimental
>> "spark.driver.userClassPathFirst" config to force it to use yours.
>>
>> See here for more details and options:
>> https://spark.apache.org/docs/1.4.0/configuration.html
>>
>> On June 16, 2015, at 10:12 PM, Raghav Shankar <raghav0110.cs@gmail.com> wrote:
>>
>> I made the change so that I could implement top() using treeReduce(). A
>> member on here suggested I make the change in RDD.scala to accomplish that.
>> Also, this is for a research project, and not for commercial use.
>>
>> So, any advice on how I can get spark-submit to use my custom-built
>> jars would be very useful.
>>
>> Thanks,
>> Raghav
>>
>> On Jun 16, 2015, at 6:57 PM, Will Briggs <wrbriggs@gmail.com> wrote:
>>
>> In general, you should avoid making direct changes to the Spark source
>> code. If you are using Scala, you can seamlessly blend your own methods on
>> top of the base RDDs using implicit conversions.
>>
>> Regards,
>> Will
>>
>> On June 16, 2015, at 7:53 PM, raggy <raghav0110.cs@gmail.com> wrote:
>>
>> I am trying to submit a Spark application using the command line. I used
>> the spark-submit command to do so. I initially set up my Spark application
>> in Eclipse and have been making changes there. I recently obtained my own
>> copy of the Spark source code and added a new method to RDD.scala. I
>> created a new spark-core jar using mvn, and I added it to my Eclipse build
>> path. My application ran perfectly fine.
>>
>> Now, I would like to submit it through the command line. I submitted my
>> application like this:
>>
>> bin/spark-submit --master local[2] --class "SimpleApp"
>> /Users/XXX/Desktop/spark2.jar
>>
>> The spark-submit command is within the spark project that I modified by
>> adding new methods.
>> When I do so, I get this error:
>>
>> java.lang.NoSuchMethodError: org.apache.spark.rdd.RDD.treeTop(ILscala/math/Ordering;)Ljava/lang/Object;
>> at SimpleApp$.main(SimpleApp.scala:12)
>> at SimpleApp.main(SimpleApp.scala)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> When I use spark-submit, where does the jar come from? How do I make sure
>> it uses the jars that I have built?
>>
>
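
One way to try the custom-jar suggestion from the thread is sketched below. It assumes, per the Spark 1.4.0 configuration page linked above, that spark.driver.extraClassPath (set with --driver-class-path in client mode) prepends entries to the driver classpath, whereas spark.driver.extraLibraryPath sets a library path for the driver JVM. The function name submit_with_custom_jar, the SPARK_SUBMIT variable, and the custom jar path are hypothetical, and by default the script only prints the command instead of running it. Whether the prepended jar really takes precedence over the bundled Spark classes should be checked against your own build.

```shell
#!/bin/sh
# Sketch only: build a spark-submit command that puts a custom Spark jar
# ahead of the default classpath. Paths and names are placeholders.

submit_with_custom_jar() {
  custom_jar="$1"   # hypothetical path to the rebuilt spark-core jar
  app_jar="$2"      # path to the application jar

  # SPARK_SUBMIT is an assumed override; unset, this is a dry run that
  # prints the command. Set SPARK_SUBMIT=bin/spark-submit to execute it.
  ${SPARK_SUBMIT:-echo bin/spark-submit} \
    --master "local[2]" \
    --class SimpleApp \
    --driver-class-path "$custom_jar" \
    --conf "spark.executor.extraClassPath=$custom_jar" \
    "$app_jar"
}

submit_with_custom_jar /path/to/custom/spark-core.jar /Users/XXX/Desktop/spark2.jar
```

On a standalone cluster, the custom jar would presumably need to exist at the same path on every worker for spark.executor.extraClassPath to resolve. With --master local[2], only the driver setting should matter.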
