spark-user mailing list archives

From phoenix bai <mingzhi...@gmail.com>
Subject spark on yarn-standalone throws StackOverflowError: fails sometimes and succeeds the rest of the time
Date Fri, 09 May 2014 10:45:58 GMT
Hi all,

My Spark code is running in yarn-standalone mode.

The last three lines of the code are as follows:

    val result = model.predict(prdctpairs)
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)
    sc.stop()

The same code sometimes runs successfully and produces the
right result, but from time to time it throws a StackOverflowError and
fails.

I don't have a clue how I should debug this.

Below is the error (the start and end portions, to be exact):


14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
44 to spark@rxxxxxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
MapOutputTrackerMaster: Size of output statuses for shuffle 44 is 148 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
45 to spark@rxxxxxx43.mc10.site.net:43885
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35]
MapOutputTrackerMaster: Size of output statuses for shuffle 45 is 453 bytes
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-20]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
44 to spark@rxxxxxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
45 to spark@rxxxxxx43.mc10.site.net:56767
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
44 to spark@rxxxxxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
45 to spark@rxxxxxx43.mc10.site.net:49879
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
TaskSetManager: Starting task 946.0:17 as TID 146 on executor 6:
rxxxxx15.mc10.site.net (PROCESS_LOCAL)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
TaskSetManager: Serialized task 946.0:17 as 6414 bytes in 0 ms
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Lost TID
133 (task 946.0:4)
14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Loss was
due to java.lang.StackOverflowError
java.lang.StackOverflowError
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)

............................................

at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
TaskSetManager: Starting task 946.0:4 as TID 147 on executor 6:
rxxxx15.mc10.site.net (PROCESS_LOCAL)
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
TaskSetManager: Serialized task 946.0:4 as 6414 bytes in 0 ms
14-05-09 17:55:51 WARN [Result resolver thread-1] TaskSetManager: Lost TID
139 (task 946.0:10)
14-05-09 17:55:51 INFO [Result resolver thread-1] TaskSetManager: Loss was
due to java.lang.StackOverflowError [duplicate 1]
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
CoarseGrainedSchedulerBackend: Executor 4 disconnected, so removing it
14-05-09 17:55:51 ERROR [spark-akka.actor.default-dispatcher-5]
YarnClusterScheduler: Lost executor 4 on rxxxxx01.mc10.site.net: remote
Akka client disassociated
14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
TaskSetManager: Re-queueing tasks for 4 from TaskSet 992.0

Has anyone had a similar issue?
Or could anyone provide a clue about where I should start looking?

Thanks in advance!
