spark-issues mailing list archives

From "nirav patel (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-9515) Creating JavaSparkContext with yarn-cluster mode throws NPE
Date Wed, 05 Aug 2015 23:31:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-9515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nirav patel updated SPARK-9515:
-------------------------------
    Description: 
I have a Spark application that runs against a YARN cluster. I run the Spark application as part of
my web application. I can't use the spark-submit script. I launch it with `java -cp myApp.jar com.myapp.Application`,
which in turn initializes a JavaSparkContext. It used to work with Spark 1.0.2 and a standalone cluster,
but now with 1.3.1 and YARN it is failing.

Caused by: java.lang.NullPointerException
	at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:580)
	at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)

EDIT:
I got it working with yarn-client mode; however, I want to test it with yarn-cluster mode
as well.
The application design is as follows: we create a singleton SparkContext object and preload a few RDDs in memory
when our spring-boot application (Tomcat container) starts. That allows us to submit subsequent
Spark jobs without the overhead of creating a new SparkContext and RDDs. It performs excellently for
our SLA. We are serving real-time GLM in milliseconds with that. I hope this is reason enough why
we can't use the spark-submit script to submit a job.

The code is pretty simple. This is how we create the SparkContext:

SparkConf conf = new SparkConf().setAppName(appName.toString()).setMaster("yarn-client");
conf.set("spark.eventLog.enabled", "true");
conf.set("spark.executor.extraClassPath", "/opt/mapr/hbase/hbase-0.98.12/lib/*");
conf.set("spark.cores.max", sparkCoreMax);
conf.set("spark.executor.memory", sparkExecMem);
conf.set("spark.executor.extraJavaOptions", executorJavaOPts);
conf.set("spark.akka.threads", sparkDriverThreads);
JavaSparkContext sparkContext = new JavaSparkContext(conf);
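
The singleton design described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the application's actual code: `SharedContextHolder` and `SharedContextDemo` are hypothetical names, and a plain `Object` stands in for the JavaSparkContext so that the example runs without Spark on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical holder that creates one shared instance on first use and
 * returns the same instance afterwards. In the application described above,
 * the factory would build the JavaSparkContext; here it is left generic.
 */
final class SharedContextHolder<T> {
    private final Supplier<T> factory;
    private volatile T instance; // volatile so other threads see the published instance

    SharedContextHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, creating it at most once (assumes the factory never returns null). */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}

final class SharedContextDemo {
    public static void main(String[] args) {
        AtomicInteger creations = new AtomicInteger();
        // Stand-in factory: a real one might call new JavaSparkContext(conf).
        SharedContextHolder<Object> holder = new SharedContextHolder<>(() -> {
            creations.incrementAndGet();
            return new Object();
        });
        Object first = holder.get();
        Object second = holder.get();
        System.out.println(first == second);  // prints "true"
        System.out.println(creations.get());  // prints "1"
    }
}
```

The point of the holder is that the expensive creation step runs once at startup and every later request reuses the same object, which is the property the design above relies on.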

This is how we actually run the spring-boot app:
java -Dloader.path=myspringbootapp.jar,/spark/spark-1.3.1/lib,/opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop,/opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/yarn
-XX:PermSize=512m -XX:MaxPermSize=512m -Xms1024m -jar myspringbootapp.jar

  was:
I have a Spark application that runs against a YARN cluster. I run the Spark application as part of
my web application. I can't use the spark-submit script. I launch it with `java -cp myApp.jar com.myapp.Application`,
which in turn initializes a JavaSparkContext. It used to work with Spark 1.0.2 and a standalone cluster,
but now with 1.3.1 and YARN it is failing.

Caused by: java.lang.NullPointerException
	at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:580)
	at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)

EDIT:

The application design is as follows: we create a singleton SparkContext object and preload a few RDDs in
memory when our spring-boot application (Tomcat container) starts.



> Creating JavaSparkContext with yarn-cluster mode throws NPE
> -----------------------------------------------------------
>
>                 Key: SPARK-9515
>                 URL: https://issues.apache.org/jira/browse/SPARK-9515
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.3.1
>            Reporter: nirav patel
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


