spark-user mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: Spark 1.3.1 On Mesos Issues.
Date Mon, 08 Jun 2015 19:29:10 GMT
It appears this may be related.

https://issues.apache.org/jira/browse/SPARK-1403

Granted, the NPE is in MapR's code, but having Spark change its behavior
here (if that is what it is doing; I am not an expert and am going only by
the comments on the issue) probably isn't good either. The level at which
this is happening is way above my head.  :)
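
One check suggested further down this thread is whether Spark 1.2 jars get
picked up at run time on some nodes. A minimal sketch of that check, assuming
the classpath of the local run and of a Mesos executor have each been captured
as a colon-separated string. The function names, paths, and jar names below
are made up for illustration and are not taken from the actual cluster:

```python
import re

# Matches jar file names such as spark-assembly-1.3.1-hadoop2.4.0.jar or
# spark-core_2.10-1.2.0.jar and captures the first x.y.z version that
# follows a hyphen (the Scala suffix "_2.10" is skipped).
SPARK_JAR = re.compile(r"^spark-.*?-(\d+\.\d+\.\d+)")


def spark_versions(classpath, sep=":"):
    """Map each Spark version seen in jar names to the entries carrying it."""
    found = {}
    for entry in classpath.split(sep):
        name = entry.rsplit("/", 1)[-1]  # file name without directories
        match = SPARK_JAR.match(name)
        if match:
            found.setdefault(match.group(1), []).append(entry)
    return found


def classpath_diff(local_cp, executor_cp, sep=":"):
    """Return (entries only in local_cp, entries only in executor_cp)."""
    local = set(e for e in local_cp.split(sep) if e)
    remote = set(e for e in executor_cp.split(sep) if e)
    return sorted(local - remote), sorted(remote - local)


# Hypothetical classpaths, for illustration only.
local_cp = ("/opt/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:"
            "/opt/example/lib/example-fs-client.jar")
executor_cp = local_cp + ":/opt/old/spark-assembly-1.2.0-hadoop2.4.0.jar"

versions = spark_versions(executor_cp)
if len(versions) > 1:
    print("mixed Spark versions on classpath: %s" % sorted(versions))
print("only on executor: %s" % classpath_diff(local_cp, executor_cp)[1])
```

More than one key in the result of spark_versions would point to mixed Spark
jars on that node. An identical classpath on both sides, as reported in the
original message, would instead point away from stale jars and toward
something like the class loader setup.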



On Fri, Jun 5, 2015 at 4:38 PM, John Omernik <john@omernik.com> wrote:

> Thanks, all. The answers.mapr.com post is mine too; I am working both
> threads. Ted is aware of it as well, and MapR is helping me with it.  I will
> report the outcome of that investigation when we have it.
>
> As to reproduction: I've installed the MapR file system (tried both
> versions 4.0.2 and 4.1.0), have Mesos running alongside MapR, and use the
> standard methods for submitting Spark jobs to Mesos. I don't have my
> configs now (on vacation :) ), but I can share them on Monday.
>
> I appreciate the support I am getting from everyone: the Mesos community,
> the Spark community, and MapR.  Great to see folks solving problems, and I
> will be sure to report back findings as they arise.
>
>
>
> On Friday, June 5, 2015, Tim Chen <tim@mesosphere.io> wrote:
>
>> It seems like there is another thread going on:
>>
>>
>> http://answers.mapr.com/questions/163353/spark-from-apache-downloads-site-for-mapr.html
>>
>> I'm not sure why, but the problem seems to be that getting the current
>> context class loader returns null in this instance.
>>
>> Do you have repro steps or a config we can try this with?
>>
>> Tim
>>
>> On Fri, Jun 5, 2015 at 3:40 AM, Steve Loughran <stevel@hortonworks.com>
>> wrote:
>>
>>>
>>>  On 2 Jun 2015, at 00:14, Dean Wampler <deanwampler@gmail.com> wrote:
>>>
>>>  It would be nice to see the code for the MapR FS Java API, but my
>>> Google-fu failed me (assuming it's open source)...
>>>
>>>
>>>  I know that MapRFS is closed source; I don't know about the Java JAR.
>>> Why not ask Ted Dunning (cc'd) nicely to see if he can track down the
>>> stack trace for you?
>>>
>>>   So, shooting in the dark ;) there are a few things I would check, if
>>> you haven't already:
>>>
>>>  1. Could there be 1.2 versions of some Spark jars that get picked up
>>> at run time (but apparently not in local mode) on one or more nodes? (Side
>>> question: Does your node experiment fail on all nodes?) Put another way,
>>> are the classpaths good for all JVM tasks?
>>> 2. Can you use just MapR and Spark 1.3.1 successfully, bypassing Mesos?
>>>
>>>  Incidentally, how are you combining Mesos and MapR? Are you running
>>> Spark in Mesos, but accessing data in MapR-FS?
>>>
>>>  Perhaps the MapR "shim" library doesn't support Spark 1.3.1.
>>>
>>>  HTH,
>>>
>>>  dean
>>>
>>>  Dean Wampler, Ph.D.
>>> Author: Programming Scala, 2nd Edition
>>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
>>> Typesafe <http://typesafe.com/>
>>> @deanwampler <http://twitter.com/deanwampler>
>>> http://polyglotprogramming.com
>>>
>>> On Mon, Jun 1, 2015 at 2:49 PM, John Omernik <john@omernik.com> wrote:
>>>
>>>> All -
>>>>
>>>>  I am facing an odd issue and I am not really sure where to go for
>>>> support at this point.  I am running MapR, which complicates things as it
>>>> relates to Mesos; however, this HAS worked in the past with no issues, so
>>>> I am stumped here.
>>>>
>>>>  So for starters, here is what I am trying to run. This is a simple
>>>> show tables using the Hive Context:
>>>>
>>>> # Run inside the pyspark shell, which provides `sc`
>>>> from pyspark import SparkContext, SparkConf
>>>> from pyspark.sql import SQLContext, Row, HiveContext
>>>> sparkhc = HiveContext(sc)
>>>> test = sparkhc.sql("show tables")
>>>> for r in test.collect():
>>>>   print r
>>>>
>>>>  When I run it on 1.3.1 using ./bin/pyspark --master local, this works
>>>> with no issues.
>>>>
>>>>  When I run it using Mesos with all the settings configured (as they
>>>> had worked in the past), I get lost tasks, and when I drill into them, the
>>>> error being reported is below.  Basically it's a NullPointerException in
>>>> com.mapr.fs.ShimLoader.  What's weird to me is that I compared the two
>>>> setups side by side, and the class path and everything else are exactly
>>>> the same.  Yet running in local mode works, and running in Mesos fails.
>>>> Also of note, when the task is scheduled on the same node where I run
>>>> locally, it fails too! (Baffling.)
>>>>
>>>>  Ok, for comparison, here is how I configured Mesos: I downloaded the
>>>> mapr4 package from spark.apache.org and used the exact same configuration
>>>> file as for 1.2.0 (except for changing the executor tgz from 1.2.0 to
>>>> 1.3.1).  When I run this example with the mapr4 package for 1.2.0 there is
>>>> no issue in Mesos; everything runs as intended. With the same package for
>>>> 1.3.1, it fails.
>>>>
>>>>  (Also of note, 1.2.1 gives a 404 error, 1.2.2 fails, and 1.3.0 fails
>>>> as well).
>>>>
>>>>  So basically, when I used 1.2.0 and followed a set of steps, it worked
>>>> on Mesos, and 1.3.1 fails.  Since 1.3.1 is a "current" version of Spark,
>>>> MapR does not support it yet; MapR supports 1.2.1 only.  (Still working on
>>>> that.)
>>>>
>>>>  I guess I am at a loss right now on why this would be happening, any
>>>> pointers on where I could look or what I could tweak would be greatly
>>>> appreciated. Additionally, if there is something I could specifically draw
>>>> to the attention of MapR on this problem please let me know, I am perplexed
>>>> on the change from 1.2.0 to 1.3.1.
>>>>
>>>>  Thank you,
>>>>
>>>>  John
>>>>
>>>>
>>>>
>>>>
>>>>  Full Error on 1.3.1 on Mesos:
>>>> 15/05/19 09:31:26 INFO MemoryStore: MemoryStore started with capacity 1060.3 MB
>>>> java.lang.NullPointerException
>>>>   at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:96)
>>>>   at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:232)
>>>>   at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
>>>>   at org.apache.hadoop.conf.CoreDefaultProperties.(CoreDefaultProperties.java:60)
>>>>   at java.lang.Class.forName0(Native Method)
>>>>   at java.lang.Class.forName(Class.java:274)
>>>>   at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
>>>>   at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
>>>>   at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
>>>>   at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
>>>>   at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
>>>>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
>>>>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil.(SparkHadoopUtil.scala:43)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala:220)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala)
>>>>   at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
>>>>   at org.apache.spark.storage.BlockManager.(BlockManager.scala:104)
>>>>   at org.apache.spark.storage.BlockManager.(BlockManager.scala:179)
>>>>   at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
>>>>   at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
>>>>   at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
>>>> java.lang.RuntimeException: Failure loading MapRClient.
>>>>   at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:283)
>>>>   at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
>>>>   at org.apache.hadoop.conf.CoreDefaultProperties.(CoreDefaultProperties.java:60)
>>>>   at java.lang.Class.forName0(Native Method)
>>>>   at java.lang.Class.forName(Class.java:274)
>>>>   at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
>>>>   at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
>>>>   at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
>>>>   at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
>>>>   at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
>>>>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
>>>>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil.(SparkHadoopUtil.scala:43)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala:220)
>>>>   at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala)
>>>>   at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
>>>>   at org.apache.spark.storage.BlockManager.(BlockManager.scala:104)
>>>>   at org.apache.spark.storage.BlockManager.(BlockManager.scala:179)
>>>>   at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
>>>>   at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
>>>>   at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
>>>>
>>>>
>>>>
>>>
>>>
>>
>
> --
> Sent from my iThing
>
