spark-user mailing list archives

From "肥肥" <19934...@qq.com>
Subject SparkException: env SPARK_YARN_APP_JAR is not set
Date Wed, 23 Apr 2014 10:05:26 GMT
I have a small program which I can launch successfully via the YARN client in yarn-standalone mode.

The command looks like this:
(javac -classpath .:jars/spark-assembly-0.9.1-hadoop2.2.0.jar LoadTest.java)
(jar cvf loadtest.jar LoadTest.class) 
SPARK_JAR=assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop2.2.0.jar ./bin/spark-class org.apache.spark.deploy.yarn.Client --jar /opt/mytest/loadtest.jar --class LoadTest --args yarn-standalone --num-workers 2 --master-memory 2g --worker-memory 2g --worker-cores 1

The program LoadTest.java:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LoadTest {
    static final String USER = "root";

    public static void main(String[] args) {
        System.setProperty("user.name", USER);
        System.setProperty("HADOOP_USER_NAME", USER);
        System.setProperty("spark.executor.memory", "7g");

        // args[0] is the master URL ("yarn-standalone" when launched via the YARN Client)
        JavaSparkContext sc = new JavaSparkContext(args[0], "LoadTest",
                System.getenv("SPARK_HOME"), JavaSparkContext.jarOfClass(LoadTest.class));

        // Count the lines of a local file as a smoke test
        String file = "file:/opt/mytest/123.data";
        JavaRDD<String> data1 = sc.textFile(file, 2);
        long c1 = data1.count();
        System.out.println("1============" + c1);
    }
}

But because of another program's needs, I must run it with the plain "java" command. So I added the "environment" parameter to JavaSparkContext(). This is the error I get:
Exception in thread "main" org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:125)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:200)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:100)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:93)
        at LoadTest.main(LoadTest.java:37)
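
If I understand Java correctly, System.setProperty() only changes JVM system properties, while the failing check in YarnClientSchedulerBackend.start() presumably reads the process environment via System.getenv(), which cannot be changed from inside a running JVM. A quick standalone check of that difference (the class name EnvCheck is just for illustration):

public class EnvCheck {
    public static void main(String[] args) {
        // A system property set at runtime is visible through getProperty()...
        System.setProperty("SPARK_YARN_APP_JAR", "file:/opt/mytest/loadtest.jar");
        System.out.println(System.getProperty("SPARK_YARN_APP_JAR"));

        // ...but getenv() still returns null unless the variable was
        // exported before the JVM was launched
        System.out.println(System.getenv("SPARK_YARN_APP_JAR"));
    }
}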

The modified LoadTest.java:

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LoadTest {
    static final String USER = "root";

    public static void main(String[] args) {
        System.setProperty("user.name", USER);
        System.setProperty("HADOOP_USER_NAME", USER);
        System.setProperty("spark.executor.memory", "7g");

        // Pass the YARN settings through the environment map instead of the shell
        Map<String, String> env = new HashMap<String, String>();
        env.put("SPARK_YARN_APP_JAR", "file:/opt/mytest/loadtest.jar");
        env.put("SPARK_WORKER_INSTANCES", "2");
        env.put("SPARK_WORKER_CORES", "1");
        env.put("SPARK_WORKER_MEMORY", "2G");
        env.put("SPARK_MASTER_MEMORY", "2G");
        env.put("SPARK_YARN_APP_NAME", "LoadTest");
        env.put("SPARK_YARN_DIST_ARCHIVES", "file:/opt/test/spark-0.9.1-bin-hadoop1/assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop2.2.0.jar");

        // The master is now hard-coded to yarn-client instead of coming from args[0]
        JavaSparkContext sc = new JavaSparkContext("yarn-client", "LoadTest",
                System.getenv("SPARK_HOME"), JavaSparkContext.jarOfClass(LoadTest.class), env);

        String file = "file:/opt/mytest/123.dna";
        JavaRDD<String> data1 = sc.textFile(file, 2); // .cache();

        long c1 = data1.count();
        System.out.println("1============" + c1);
    }
}

The commands:
javac -classpath .:jars/spark-assembly-0.9.1-hadoop2.2.0.jar LoadTest.java
jar cvf loadtest.jar LoadTest.class
nohup java -classpath .:jars/spark-assembly-0.9.1-hadoop2.2.0.jar LoadTest >> loadTest.log 2>&1 &
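
Should I instead export the variables in the launching shell, so that System.getenv() can see them? Something along these lines (paths copied from above, untested):

export SPARK_JAR=file:/opt/test/spark-0.9.1-bin-hadoop1/assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop2.2.0.jar
export SPARK_YARN_APP_JAR=file:/opt/mytest/loadtest.jar
nohup java -classpath .:jars/spark-assembly-0.9.1-hadoop2.2.0.jar LoadTest >> loadTest.log 2>&1 &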

What did I miss? Or did I do it the wrong way?