spark-user mailing list archives

From Fengyun RAO <raofeng...@gmail.com>
Subject Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?
Date Mon, 18 May 2015 03:49:10 GMT
Thanks, Wilfred.

The problem is that the jar
"/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar"
is on every node in the cluster, since we installed CDH 5.4.

So whether we run in client or cluster mode, the driver has access to the
jar.

What's more, the driver does not depend on the jar; it is the executor that
throws the "ClassNotFoundException".
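
Since the executor is the one that fails, one sanity check is to confirm on a node that the jar at that path really contains org.apache.htrace.Trace. Below is a minimal sketch, assuming Python is available on the node; the function name `jar_contains_class` is hypothetical and not part of Spark or CDH. It only shows that the class is inside the jar file, not that the jar is on the executor's classpath.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) has an entry for class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, calling `jar_contains_class("/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar", "org.apache.htrace.Trace")` on a node should return True if the jar there is intact.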


2015-05-18 6:53 GMT+08:00 Wilfred Spiegelenburg <wspiegelenburg@cloudera.com>:

> When you run the driver in the cluster the application really runs from
> the cluster and the client goes away. If the driver does not have access to
> the jars, i.e. if they are not available somewhere on the cluster, this
> will happen.
> If you run the driver on the client the driver has access to the jars
> there. Unless you have copied the jars onto the cluster it will not work.
> That is what SPARK-5377 is all about.
>
> Wilfred
>
> On 15/05/2015 00:37, Fengyun RAO wrote:
>
>> Thanks, Wilfred.
>>
>> In our program, the "htrace-core-3.1.0-incubating.jar" dependency is
>> only required in the executor, not in the driver,
>> while in both "yarn-client" and "yarn-cluster" the executor runs in the
>> cluster.
>>
>> And clearly, in "yarn-cluster" mode the jar IS in
>> "spark.yarn.secondary.jars", but it still throws ClassNotFoundException.
>>
>> 2015-05-14 18:52 GMT+08:00 Wilfred Spiegelenburg
>> <wspiegelenburg@cloudera.com>:
>>
>>     In cluster mode the driver runs in the cluster and not locally in the
>>     spark-submit JVM. This changes what is available on your classpath.
>>     It looks like you are running into a situation similar to the one
>>     described in SPARK-5377.
>>
>>     Wilfred
>>
>>     On 14/05/2015 13:47, Fengyun RAO wrote:
>>
>>         I looked at the "Environment" in both modes.
>>
>>         yarn-client:
>>         spark.jars
>>         local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,file:/home/xxx/my-app.jar
>>
>>         yarn-cluster:
>>         spark.yarn.secondary.jars
>>         local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
>>
>>         I wonder why htrace exists in "spark.yarn.secondary.jars" but
>>         is still not found by the URLClassLoader.
>>
>>         I tried both the "local" and "file" schemes for the jar, still the
>>         same error.
>>
>>
>>         2015-05-14 11:37 GMT+08:00 Fengyun RAO <raofengyun@gmail.com>:
>>
>>
>>
>>              Hadoop version: CDH 5.4.
>>
>>              We need to connect to HBase, and thus need the extra
>>              "/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar"
>>              dependency.
>>
>>              It works in yarn-client mode:
>>              "spark-submit --class xxx.xxx.MyApp --master yarn-client
>>              --num-executors 10 --executor-memory 10g --jars
>>              /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
>>              my-app.jar /input /output"
>>
>>              However, if we change "yarn-client" to "yarn-cluster", it
>>              throws a ClassNotFoundException (the class actually exists in
>>              htrace-core-3.1.0-incubating.jar):
>>
>>              Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>>                  at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>>                  at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>>                  at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>>                  at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>>                  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>>                  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>>                  ... 21 more
>>              Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace
>>                  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>                  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>                  at java.security.AccessController.doPrivileged(Native Method)
>>                  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>                  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>                  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>                  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>
>>
>>              Why doesn't --jars work in yarn-cluster mode? How do we add an
>>              extra dependency in "yarn-cluster" mode?
>>
>>
>>
>>         --
>>
>>         ---
>>         You received this message because you are subscribed to the Google
>>         Groups "CDH Users" group.
>>         To unsubscribe from this group and stop receiving emails from
>>         it, send
>>         an email to cdh-user+unsubscribe@cloudera.org.
>>         For more options, visit
>>         https://groups.google.com/a/cloudera.org/d/optout.
>>
>>
>>     --
>>     Wilfred Spiegelenburg
>>     Backline Customer Operations Engineer
>>     YARN/MapReduce/Spark
>>
>>     http://www.cloudera.com
>>     --
>>     http://five.sentenc.es
