spark-user mailing list archives

From Costin Leau <costin.l...@gmail.com>
Subject Re: Re: Classpath hell and Elasticsearch 2.3.2...
Date Fri, 03 Jun 2016 09:54:31 GMT
Hi,

Sorry to hear about your troubles. Not sure whether you are aware of the ES-Hadoop docs [1]. I've raised an issue [2] to better clarify the usage of the elasticsearch-hadoop vs. elasticsearch-spark jars.

Apologies for the delayed response; for ES-Hadoop questions/issues it's best to use the dedicated forum, namely https://discuss.elastic.co/c/elasticsearch-and-hadoop (see [3]).

Hope this helps,

[1] https://www.elastic.co/guide/en/elasticsearch/hadoop/2.3/spark.html
[2] https://github.com/elastic/elasticsearch-hadoop/issues/780
[3] https://www.elastic.co/guide/en/elasticsearch/hadoop/master/troubleshooting.html#help
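
In short, only one ES-Hadoop artifact should be on the classpath, and its Scala suffix must match the Spark build (as the thread below works out). A minimal sketch, assuming a Scala 2.10 Spark build; the jar path and job jar are hypothetical:

```shell
# Ship a single ES-Hadoop artifact whose Scala suffix matches the Scala
# version Spark was built with (2.10 assumed here):
spark-submit \
  --jars /usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.10-2.3.2.jar \
  my-job.jar
```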


On 6/3/16 2:06 AM, Kevin Burton wrote:
> Yeah.. thanks Nick. Figured that out since your last email... I deleted the 2.10 by accident but then put 2+2 together.
>
> Got it working now.
>
> Still sticking to my story that it's somewhat complicated to set up :)
>
> Kevin
>
> On Thu, Jun 2, 2016 at 3:59 PM, Nick Pentreath <nick.pentreath@gmail.com> wrote:
>
>     Which Scala version is Spark built against? I'd guess it's 2.10 since you're using spark-1.6, and you're using the 2.11 jar for es-hadoop.
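
A quick way to check which Scala version a Spark build uses (the banner wording below is from 1.x builds and may differ in other versions):

```shell
# The version banner includes the Scala version Spark was compiled
# against, e.g. "Using Scala version 2.10.5" for stock 1.6 binaries:
spark-submit --version
```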
>
>
>     On Thu, 2 Jun 2016 at 15:50 Kevin Burton <burton@spinn3r.com> wrote:
>
>         Thanks.
>
>         I'm trying to run it in a standalone cluster with an existing, large 100-node ES install.
>
>         I'm using the standard 1.6.1-2.6 distribution with elasticsearch-hadoop-2.3.2...
>
>         I *think* I'm only supposed to use the elasticsearch-spark_2.11-2.3.2.jar with it...
>
>         but now I get the following exception:
>
>
>         java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
>         at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:52)
>         at org.elasticsearch.spark.package$SparkRDDFunctions.saveToEs(package.scala:37)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
>         at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
>         at $iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
>         at $iwC$$iwC$$iwC.<init>(<console>:61)
>         at $iwC$$iwC.<init>(<console>:63)
>         at $iwC.<init>(<console>:65)
>         at <init>(<console>:67)
>         at .<init>(<console>:71)
>         at .<clinit>(<console>)
>         at .<init>(<console>:7)
>         at .<clinit>(<console>)
>         at $print(<console>)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>         at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>         at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>         at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>         at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>         at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:875)
>         at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>         at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>         at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>         at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>         at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>         at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>         at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>         at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>         at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>         at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>         at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>         at org.apache.spark.repl.Main$.main(Main.scala:31)
>         at org.apache.spark.repl.Main.main(Main.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
>         On Thu, Jun 2, 2016 at 3:45 PM, Nick Pentreath <nick.pentreath@gmail.com> wrote:
>
>             Hey there
>
>             When I used es-hadoop, I just pulled in the dependency into my pom.xml, with spark as a "provided" dependency, and built a fat jar with assembly.
>
>             Then with spark-submit use the --jars option to include your assembly jar (IIRC I sometimes also needed to use --driver-classpath too, but perhaps not with recent Spark versions).
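
The submit step described here might look like the sketch below; the jar path and main class are hypothetical, and passing the assembly as both --jars and the application jar is one common pattern:

```shell
# Spark is a "provided" dependency in the build, so the assembly jar
# bundles es-hadoop plus the application code but not Spark itself:
spark-submit \
  --class com.example.MyJob \
  --jars /path/to/my-job-assembly.jar \
  /path/to/my-job-assembly.jar
```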
>
>
>
>             On Thu, 2 Jun 2016 at 15:34 Kevin Burton <burton@spinn3r.com> wrote:
>
>                 I'm trying to get spark 1.6.1 to work with 2.3.2... needless to say it's not super easy.
>
>                 I wish there was an easier way to get this stuff to work. Last time I tried to use spark more, I was having similar problems with classpath setup and Cassandra.
>
>                 Seems a huge opportunity to make this easier for new developers. This stuff isn't rocket science but it can (needlessly) waste a ton of time.
>
>                 ... anyway... I have since figured out I have to pick *specific* jars from the elasticsearch-hadoop distribution and use those.
>
>                 Right now I'm using :
>
>                 SPARK_CLASSPATH=/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar:/usr/share/apache-spark/lib/*
>
>                 ... but I'm getting:
>
>                 java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.hadoop.util.Version
>                 at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
>                 at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
>                 at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>                 at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>                 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>                 at org.apache.spark.scheduler.Task.run(Task.scala:89)
>                 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>                 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>                 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
>                 ... but I think its caused by this:
>
>                 16/06/03 00:26:48 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.Error:
>                 Multiple ES-Hadoop versions detected in the classpath; please use only one
>                 jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar
>                 jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar
>                 jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar
>
>                 at org.elasticsearch.hadoop.util.Version.<clinit>(Version.java:73)
>                 at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
>                 at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
>                 at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>                 at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>                 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>                 at org.apache.spark.scheduler.Task.run(Task.scala:89)
>                 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>                 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>                 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>                 at java.lang.Thread.run(Thread.java:745)
>
>                 .. still tracking this down but was wondering if there is something obvious I'm doing wrong. I'm going to take out elasticsearch-hadoop-2.3.2.jar and try again.
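
That reading fits the error text: elasticsearch-hadoop-2.3.2.jar is the uber jar and already contains the classes shipped in the -spark and -mr jars, so having any two of them on the classpath trips the duplicate-version check. A sketch of a classpath with a single artifact (the _2.10 suffix is an assumption; it must match the Scala version Spark was built with):

```shell
# Only one ES-Hadoop artifact, matching Spark's Scala version:
SPARK_CLASSPATH=/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.10-2.3.2.jar:/usr/share/apache-spark/lib/*
```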
>
>                 Lots of trial and error here :-/
>
>                 Kevin
>
>                 --
>
>                 We're hiring if you know of any awesome Java Devops or Linux Operations Engineers!
>
>                 Founder/CEO Spinn3r.com
>                 Location: San Francisco, CA
>                 blog: http://burtonator.wordpress.com
>                 … or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
>
>
>
>
>
>
>
>
>

-- 
Costin






---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

