mahout-user mailing list archives

From BahaaEddin AlAila <bahaelai...@gmail.com>
Subject Re: Confusion regarding Samsara's configuration
Date Thu, 04 Feb 2016 17:56:46 GMT
Sorry for taking so long to reply.

Here's the original stack trace. It happens directly after spark-submit.
By the way, the code is just a bunch of imports plus a mahoutSparkContext
call in the main function, that's it.

bahaa@sparkserver:~/nystrom/samsara$ spark-submit --master "local[*]"
target/scala-2.10/double-nystrom-method_2.10-1.0.jar
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/mahout/sparkbindings/package$
        at NystromSamsara$.main(NystromSamsara.scala:32)
        at NystromSamsara.main(NystromSamsara.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException:
org.apache.mahout.sparkbindings.package$
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 11 more


However, I got it to work with a workaround: supplying the Mahout jars via
the --jars argument to spark-submit, which is not the right thing to do
(shipping the Mahout jars along with the job jar). So basically I have this
line in my .bashrc:

export
MAHOUT_JARS="/home/bahaa/mahout/mahout-integration-0.11.1.jar,/home/bahaa/mahout/mahout-math-0.11.1.jar,/home/bahaa/mahout/mahout-math-scala_2.10-0.11.1.jar,/home/bahaa/mahout/mahout-hdfs-0.11.1.jar,/home/bahaa/mahout/mahout-spark-shell_2.10-0.11.1.jar,/home/bahaa/mahout/mahout-spark_2.10-0.11.1.jar,/home/bahaa/mahout/mahout-spark_2.10-0.11.1-dependency-reduced.jar,/home/bahaa/mahout/mahout-mr-0.11.1.jar"

and now I run:

spark-submit --master "local[*]" --jars $MAHOUT_JARS target/scala-2.10/double-nystrom-method_2.10-1.0.jar
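A cleaner long-term alternative to listing every jar by hand might be to bundle the Mahout dependencies into the application jar itself with the sbt-assembly plugin. This is only a sketch, assuming sbt 0.13 with sbt-assembly 0.14; the merge strategy shown is a common way to resolve duplicate META-INF entries, not something from this thread:

```scala
// project/assembly.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")

// build.sbt additions: mark Spark itself as "provided" so only Mahout
// (and the app's other true dependencies) end up inside the fat jar.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"

// Duplicate META-INF files from the merged jars would otherwise abort assembly.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```

With that in place, `sbt assembly` should produce a single jar that spark-submit can run without any --jars flag.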

I also tried the spark.executor.extraClassPath and
spark.driver.extraClassPath properties, like this:

spark-submit --master "local[*]" --conf "spark.driver.extraClassPath=$MAHOUT_JARS spark.executor.extraClassPath=$MAHOUT_JARS" target/scala-2.10/double-nystrom-method_2.10-1.0.jar

but with no success.
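For what it's worth, two details of that invocation may be the problem, assuming standard Spark CLI behavior: spark-submit expects one key=value pair per --conf flag, and the extraClassPath properties take a colon-separated path list on Linux, whereas $MAHOUT_JARS is comma-separated (the format --jars wants). A corrected attempt would look something like this (the fallback jar paths are just examples):

```shell
# MAHOUT_JARS is comma-separated for --jars; extraClassPath wants colons,
# so convert the separator. The default below is only an illustrative value.
MAHOUT_JARS="${MAHOUT_JARS:-/home/bahaa/mahout/mahout-math-0.11.1.jar,/home/bahaa/mahout/mahout-spark_2.10-0.11.1.jar}"
MAHOUT_CP="$(printf '%s' "$MAHOUT_JARS" | tr ',' ':')"
echo "$MAHOUT_CP"

# Then pass one --conf flag per property:
# spark-submit --master "local[*]" \
#   --conf "spark.driver.extraClassPath=$MAHOUT_CP" \
#   --conf "spark.executor.extraClassPath=$MAHOUT_CP" \
#   target/scala-2.10/double-nystrom-method_2.10-1.0.jar
```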

I also notice that the Mahout code that adds the Mahout jars to the runtime
classpath IS inside the mahoutSparkContext call, so if no jar is available
to the Spark executor to even define mahoutSparkContext, how would we
expect that to work in the first place?
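To illustrate that chicken-and-egg point: the jar discovery has to run on the driver before the context exists, so the driver's own classpath must already contain mahout-spark for the mahoutSparkContext symbol to resolve at all. A rough, self-contained sketch of what that driver-side discovery amounts to (the object and method names here are made up for illustration, not Mahout's actual API):

```scala
import java.io.File

// Hypothetical helper mirroring the jar discovery described above: scan a
// Mahout installation directory for the jars that need to reach the
// executors, and join them in the comma-separated form --jars expects.
object MahoutJars {
  def find(mahoutHome: String): Seq[String] = {
    val dir = new File(mahoutHome)
    Option(dir.listFiles())                 // listFiles() is null for a bad path
      .getOrElse(Array.empty[File])
      .filter(f => f.getName.startsWith("mahout-") && f.getName.endsWith(".jar"))
      .map(_.getAbsolutePath)
      .sorted
      .toSeq
  }

  // Comma-separated list, ready for spark-submit --jars or SparkConf.setJars.
  def asJarsArg(mahoutHome: String): String = find(mahoutHome).mkString(",")
}
```

The crucial detail is that nothing like this can run until the driver JVM has already loaded the mahout-spark classes, which is exactly why the NoClassDefFoundError above fires at NystromSamsara.scala:32 before any jar shipping could happen.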


On Tue, Feb 2, 2016 at 3:56 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> oh. one more question. are you getting this exception on the front end or
> worker? Can you provide the stack trace?
>
> One strange thing is, the backend actually should not ever need the
> context; it is only the front end's thing. If you are getting this at the
> backend, it probably means you are capturing some objects as closure
> attributes without realizing it, including the context reference, which
> should not happen. This is a very common Spark programming problem.
>
> If you are getting it on the front end, then it is a classpath problem in
> your driver application and has nothing to do with Mahout itself. Make
> sure to observe the transitive dependency rules for the front end.
>
> On Tue, Feb 2, 2016 at 12:53 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > this is strange. if you took over the context, added the jars manually,
> > and it still does not work, there's something wrong with spark, i guess,
> > or permissions, or those other 1000 things that can go wrong on a
> > linux/spark deployment.
> >
> > You can try adding any custom jar to your application and calling it at
> > the backend, to test whether jar shipping works at all.
> >
> > i guess you can always drop the mahout jars into the spark classpath on
> > the worker nodes, as the most desperate measure.
> >
> > On Tue, Feb 2, 2016 at 9:10 AM, BahaaEddin AlAila <bahaelaila7@gmail.com>
> > wrote:
> >
> >> Thank you very much for your reply.
> >> As I mentioned earlier, I am using mahoutSparkContext, and MAHOUT_HOME
> >> is set to the correct mahout path.
> >> I also tried setting up the context myself, as I looked into the
> >> implementation of mahoutSparkContext, and supplied the jars path
> >> manually. Still the same error.
> >> I will try with spark 1.5 and report.
> >>
> >> Thank you very much again,
> >>
> >> Kind Regards,
> >> Bahaa
> >>
> >>
> >> On Tue, Feb 2, 2016 at 12:01 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> >> wrote:
> >>
> >> > Bahaa, first off, i don't think we have certified any of our releases
> >> > to run with spark 1.6 (yet). I think spark 1.5 is the last known
> >> > release to run with the 0.11 series.
> >> >
> >> > Second, if you use the mahoutSparkContext() method to create the
> >> > context, it would look for a MAHOUT_HOME setup to add the mahout
> >> > binaries to the job. So the reason you may not be getting it is
> >> > perhaps that you are not using mahoutSparkContext()?
> >> >
> >> > alternatively, you can create the context yourself, but you need to
> >> > (1) make sure it has Kryo serialization enabled and configured
> >> > properly, and (2) add all the necessary mahout jars on your own.
> >> >
> >> > -d
> >> >
> >> > On Tue, Feb 2, 2016 at 8:22 AM, BahaaEddin AlAila
> >> > <bahaelaila7@gmail.com> wrote:
> >> >
> >> > > Greetings mahout users,
> >> > >
> >> > > I have been trying to use mahout samsara as a library with
> >> > > scala/spark, but I haven't been successful in doing so.
> >> > >
> >> > > I am running the spark 1.6.0 binaries; I didn't build it myself.
> >> > > However, I tried both the readily available binaries on the Apache
> >> > > mirrors and cloning and compiling mahout's repo, but neither worked.
> >> > >
> >> > > I keep getting
> >> > >
> >> > > Exception in thread "main" java.lang.NoClassDefFoundError:
> >> > > org/apache/mahout/sparkbindings/SparkDistributedContext
> >> > >
> >> > > The way I am doing things is:
> >> > > I have spark in ~/spark-1.6
> >> > > and mahout in ~/mahout
> >> > > I have set both $SPARK_HOME and $MAHOUT_HOME accordingly, along with
> >> > > $MAHOUT_LOCAL=true
> >> > >
> >> > > and I have:
> >> > >
> >> > > ~/app1/build.sbt
> >> > > ~/app1/src/main/scala/App1.scala
> >> > >
> >> > > in build.sbt I have these lines to declare the mahout dependencies:
> >> > >
> >> > > libraryDependencies += "org.apache.mahout" %% "mahout-math-scala" % "0.11.1"
> >> > >
> >> > > libraryDependencies += "org.apache.mahout" % "mahout-math" % "0.11.1"
> >> > >
> >> > > libraryDependencies += "org.apache.mahout" % "mahout-spark_2.10" % "0.11.1"
> >> > >
> >> > > along with other spark dependencies
> >> > >
> >> > > and in App1.scala, in the main function, I construct a context
> >> > > object using mahoutSparkContext, and of course the sparkbindings
> >> > > are imported.
> >> > >
> >> > > everything compiles successfully
> >> > >
> >> > > however, when I submit to spark, I get the above mentioned error.
> >> > >
> >> > > I have a general idea of why this is happening: the compiled app1
> >> > > jar depends on the mahout-spark dependency jar, but it cannot find
> >> > > it on the classpath upon being submitted to spark.
> >> > >
> >> > > In the instructions I couldn't find how to explicitly add the
> >> > > mahout-spark dependency jar to the classpath.
> >> > >
> >> > > The question is: Am I doing the configurations correctly or not?
> >> > >
> >> > > Sorry for the lengthy email
> >> > >
> >> > > Kind Regards,
> >> > > Bahaa
> >> > >
> >> >
> >>
> >
> >
>
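As a footnote to Dmitriy's second suggestion above (building the context yourself), his two requirements translate into roughly the following Spark properties. This is a sketch from memory; the registrator class name is my recollection of the 0.11 sparkbindings and should be verified against your Mahout version:

```scala
// Settings a hand-built context would need, per the two points above:
// (1) Kryo serialization, with Mahout's registrator so Mahout's matrix
//     and vector types serialize correctly, and
// (2) the Mahout jars, added separately (e.g. via SparkConf.setJars).
val mahoutKryoConf: Map[String, String] = Map(
  "spark.serializer"       -> "org.apache.spark.serializer.KryoSerializer",
  // Class name as I recall it; verify against your Mahout build.
  "spark.kryo.registrator" -> "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator"
)
// These would be applied before constructing the SparkContext, e.g.
// new SparkConf().setAll(mahoutKryoConf).setJars(jarList)
```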
