spark-user mailing list archives

From Yong Zhang <java8...@hotmail.com>
Subject Re: spark-submit config via file
Date Fri, 24 Mar 2017 13:18:36 GMT
Of course it is possible.


You can always set any configuration in your application through the API, instead of passing it
in on the command line.


val sparkConf = new SparkConf()
  .setAppName(properties.get("appName"))
  .setMaster(properties.get("master"))
  .set("xxx", properties.get("xxx"))
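A more complete sketch of that approach (the file contents and the `SparkPropsLoader` name here are placeholders, not anything from your job): a spark-defaults-style file is one whitespace- or `=`-separated key/value pair per line, which `java.util.Properties` parses natively, so you can load it yourself and apply each entry to the SparkConf.

```scala
import java.io.StringReader
import java.util.Properties

object SparkPropsLoader {
  // Parse spark-defaults-style text: one "key value" (or "key=value")
  // pair per line; java.util.Properties accepts both separators.
  def loadSparkProps(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    var result = Map.empty[String, String]
    val names = props.stringPropertyNames().iterator()
    while (names.hasNext) {
      val k = names.next()
      result += k -> props.getProperty(k)
    }
    result
  }

  def main(args: Array[String]): Unit = {
    val conf = loadSparkProps(
      "spark.executor.memory 3072m\nspark.executor.cores 4\n")
    // With Spark on the classpath, these entries would be applied as:
    //   val sparkConf = conf.foldLeft(new SparkConf()) {
    //     case (acc, (k, v)) => acc.set(k, v)
    //   }
    println(conf)
  }
}
```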

The error you are seeing is an environment problem, not a problem with using a properties file.
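One guess, based on the "Incomplete HDFS URI, no host" message: on a cluster whose default filesystem is wasb://, a host-less hdfs:/// URI for spark.yarn.archive cannot be resolved. Giving it a fully qualified URI in the properties file may help (the namenode host and port below are placeholders, not values from your cluster):

```
spark.yarn.archive hdfs://<namenode-host>:8020/hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
```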

Yong
________________________________
From: Roy <rp346@njit.edu>
Sent: Friday, March 24, 2017 7:38 AM
To: user
Subject: spark-submit config via file

Hi,

I am trying to deploy a Spark job using spark-submit, which takes a bunch of parameters, like:

spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode cluster --executor-memory
3072m --executor-cores 4 --files streaming.conf spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar
-conf "streaming.conf"

I was looking for a way to put all these flags in a file to pass to spark-submit, to make my
spark-submit command simpler, like this:

spark-submit --class StreamingEventWriterDriver --master yarn --deploy-mode cluster --properties-file
properties.conf --files streaming.conf spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar -conf
"streaming.conf"

properties.conf has the following contents:

spark.executor.memory 3072m
spark.executor.cores 4


But I am getting the following error:

17/03/24 11:36:26 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
17/03/24 11:36:26 WARN AzureFileSystemThreadPoolExecutor: Disabling threads for Delete operation as thread count 0 is <= 1
17/03/24 11:36:26 INFO AzureFileSystemThreadPoolExecutor: Time taken for Delete operation is: 1 ms with threads: 0
17/03/24 11:36:27 INFO Client: Deleted staging directory wasb://abc@abc.blob.core.windows.net/user/sshuser/.sparkStaging/application_1488402758319_0492
Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2791)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2825)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2807)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:364)
        at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:480)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:552)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:881)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:170)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1218)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1277)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

17/03/24 11:36:27 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...

Does anyone know if this is even possible?


Thanks...

Roy
