spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: ClassNotDefException when using spark-submit with multiple jars and files located on HDFS
Date Wed, 10 Jun 2015 04:47:55 GMT
I am not sure they work with HDFS paths. You may want to look at the
source code. Alternatively, you can create a "fat" jar containing all the
jars (let your build tool set up META-INF correctly). This always works.
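
If you go the fat-jar route, it may help to confirm that the class that
fails to load really ended up inside the jar. Here is a minimal sketch using
only the JDK; the class name JarContains and its helper methods are
hypothetical names chosen for illustration, not part of Spark.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarContains {
    /** Converts a class name such as "com.example.Foo" to its jar entry path. */
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for className. */
    public static boolean contains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            // getJarEntry returns null when the entry does not exist.
            return jar.getJarEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarContains <path-to-jar> <class-name>");
            return;
        }
        System.out.println(contains(args[0], args[1]) ? "found" : "missing");
    }
}
```

Run it against the local copy of the assembled jar, passing the fully
qualified name of the class reported in the exception.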

On Wed, Jun 10, 2015 at 6:22, Dong Lei <donglei@microsoft.com> wrote:

> Thanks so much!
>
>
>
> I did put a sleep in my code to keep the UI available.
>
>
>
> Now from the UI, I can see:
>
> - In the “SparkProperty” section, spark.jars and spark.files are set
> as I want.
>
> - In the “Classpath Entries” section, my jar and file paths are
> there (with an HDFS path).
>
>
>
> And I checked the HTTP file server directory; the structure is like:
>
>      D:\data\temp
>        \-- spark-UUID
>             \-- httpd-UUID
>                  \-- jars [*empty*]
>                  \-- files [*empty*]
>
>
>
> So I guess the files and jars are not properly downloaded from HDFS to
> these folders?
>
>
>
> I’m using standalone mode.
>
>
>
> Any ideas?
>
>
>
> Thanks
>
> Dong Lei
>
>
>
> *From:* Akhil Das [mailto:akhil@sigmoidanalytics.com]
> *Sent:* Tuesday, June 9, 2015 4:46 PM
>
>
> *To:* Dong Lei
> *Cc:* user@spark.apache.org
> *Subject:* Re: ClassNotDefException when using spark-submit with multiple
> jars and files located on HDFS
>
>
>
> You can put a Thread.sleep(100000) in the code to have the UI available
> for quite some time. (Put it just before starting any of your
> transformations) Or you can enable the spark history server
> <https://spark.apache.org/docs/latest/monitoring.html> too. I believe
> --jars
> <https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management>
> would download the dependency jars on all your worker machines (can be
> found in spark work dir of your application along with stderr stdout files).
>
>
>   Thanks
>
> Best Regards
>
>
>
> On Tue, Jun 9, 2015 at 1:29 PM, Dong Lei <donglei@microsoft.com> wrote:
>
>  Thanks Akhil:
>
>
>
> The driver fails too fast for me to get a look at port 4040. Is there any
> other way to see the download and ship process of the files?
>
>
>
> Is the driver supposed to download these jars from HDFS to some location,
> then ship them to the executors?
>
> I can see from the log that the driver downloaded the application jar but
> not the other jars specified by “--jars”.
>
>
>
> Or do I misunderstand the usage of “--jars”: should the jars already be
> on every worker, so that the driver will not download them?
>
> Are there any useful docs?
>
>
>
> Thanks
>
> Dong Lei
>
>
>
>
>
> *From:* Akhil Das [mailto:akhil@sigmoidanalytics.com]
> *Sent:* Tuesday, June 9, 2015 3:24 PM
> *To:* Dong Lei
> *Cc:* user@spark.apache.org
> *Subject:* Re: ClassNotDefException when using spark-submit with multiple
> jars and files located on HDFS
>
>
>
> Once you submit the application, you can check the Environment tab in the
> driver UI (running on port 4040) to see whether the jars you added got
> shipped or not. If they are shipped and you are still getting NoClassDef
> exceptions, then it means that you have a jar conflict, which you can
> resolve by putting the jar with the class in it at the top of your
> classpath.
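
To tell a missing jar apart from a jar conflict, one option is to ask the
JVM where it actually loaded a class from. This is only a sketch, using plain
JDK calls; ClassOrigin and describe are hypothetical names, and the snippet
assumes you can call it from inside your own driver code with the name of
the class that fails.

```java
import java.security.CodeSource;

public class ClassOrigin {
    /** Describes where className was loaded from, or why it could not be loaded. */
    public static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Core JDK classes usually report no code source at all.
            String where = (src == null || src.getLocation() == null)
                    ? "JDK/bootstrap (no code source)"
                    : src.getLocation().toString();
            return className + " loaded from " + where;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError and similar failures.
            return className + " NOT loadable: " + e;
        }
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(describe(name));
        }
    }
}
```

If the class loads but the reported location is not the jar you expected,
that points to a conflict; if it is not loadable at all, the jar most likely
never reached the classpath.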
>
>
>   Thanks
>
> Best Regards
>
>
>
> On Tue, Jun 9, 2015 at 9:05 AM, Dong Lei <donglei@microsoft.com> wrote:
>
>  Hi, spark-users:
>
>
>
> I’m using spark-submit to submit multiple jars and files (all in HDFS) to
> run a job, with the following command:
>
>
>
> spark-submit
>   --class myClass
>   --master spark://localhost:7077/
>   --deploy-mode cluster
>   --jars hdfs://localhost/1.jar, hdfs://localhost/2.jar
>   --files hdfs://localhost/1.txt, hdfs://localhost/2.txt
>   hdfs://localhost/main.jar
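
One thing worth double-checking in the command above: --jars and --files
each take a single comma-separated value, so a space after a comma would make
the shell pass the next URI as a separate argument. The spaces may simply be
an artifact of the email formatting. If the command is built programmatically,
a small helper along these lines avoids the problem; JarListArg and join are
hypothetical names, and this is a sketch, not something taken from Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JarListArg {
    /** Joins URIs into one comma-separated value with no surrounding spaces. */
    public static String join(List<String> uris) {
        return uris.stream()
                .map(String::trim)              // drop stray spaces around each URI
                .filter(s -> !s.isEmpty())      // ignore empty entries
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Prints the value to pass after --jars or --files.
        System.out.println(join(Arrays.asList(args)));
    }
}
```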
>
>
>
> The stderr of the driver showed java.lang.ClassNotDefException for a class
> in 1.jar.
>
>
>
> I checked the log and saw that Spark had added these jars:
>
>      INFO SparkContext: Added JAR hdfs:// …1.jar
>
>      INFO SparkContext: Added JAR hdfs:// …2.jar
>
>
>
> In the driver's folder, I saw that only main.jar had been copied there,
> *but the other jars and files were not there*.
>
>
>
> Could someone explain *how I should pass the jars and files* needed by
> the main jar to Spark?
>
>
>
> If my class in main.jar refers to these files with a relative path, *will
> Spark copy these files into one folder*?
>
>
>
> BTW, my class works in client mode with all jars and files stored locally.
>
>
>
> Thanks
>
> Dong Lei
>
>
>
>
>
