spark-user mailing list archives

From phoenix bai <mingzhi...@gmail.com>
Subject Re: spark on yarn-standalone, throws StackOverflowError and fails sometimes and succeeds for the rest
Date Fri, 09 May 2014 15:20:48 GMT
After a couple of tests, I find that if I use:

    val result = model.predict(prdctpairs)
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)

it always fails with the error from my earlier message (quoted below), and the stack trace seems to repeat itself.

But if I do:

    val result = model.predict(prdctpairs)
    result.cache()
    result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)

it succeeds.

Could anyone help explain why the cache() call is necessary?

thanks
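
For readers following along, the per-record formatting in both snippets can be pulled out into a named function. This is only a sketch: the Rating case class here is a local stand-in that mirrors the user, product, and rating fields used above, and RatingCsv and toCsvLine are hypothetical names, not part of the original code or of Spark.

```scala
// Hypothetical stand-alone sketch; no Spark dependency is needed to run it.
object RatingCsv {
  // Local stand-in mirroring the fields accessed in the snippets above (assumption).
  final case class Rating(user: Int, product: Int, rating: Double)

  // Builds the same "user,product,rating" line as the lambda passed to map.
  def toCsvLine(r: Rating): String = s"${r.user},${r.product},${r.rating}"
}
```

In the actual job, a function of the same shape taking the element type of result could be passed to map in place of the inline lambda.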



On Fri, May 9, 2014 at 6:45 PM, phoenix bai <mingzhibai@gmail.com> wrote:

> Hi all,
>
> My Spark code is running in yarn-standalone mode.
>
> The last three lines of the code are as follows:
>
>     val result = model.predict(prdctpairs)
>     result.map(x => x.user + "," + x.product + "," + x.rating).saveAsTextFile(output)
>     sc.stop()
>
> The same code sometimes runs successfully and gives the right result,
> but from time to time it throws a StackOverflowError and fails.
>
> I don't have a clue how to debug this.
>
> Below is the error (the start and end portions, to be exact):
>
>
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 44 to spark@rxxxxxx43.mc10.site.net:43885
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
> MapOutputTrackerMaster: Size of output statuses for shuffle 44 is 148 bytes
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 45 to spark@rxxxxxx43.mc10.site.net:43885
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-35]
> MapOutputTrackerMaster: Size of output statuses for shuffle 45 is 453 bytes
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-20]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 44 to spark@rxxxxxx43.mc10.site.net:56767
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 45 to spark@rxxxxxx43.mc10.site.net:56767
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 44 to spark@rxxxxxx43.mc10.site.net:49879
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-29]
> MapOutputTrackerMasterActor: Asked to send map output locations for shuffle
> 45 to spark@rxxxxxx43.mc10.site.net:49879
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
> TaskSetManager: Starting task 946.0:17 as TID 146 on executor 6:
> rxxxxx15.mc10.site.net (PROCESS_LOCAL)
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-17]
> TaskSetManager: Serialized task 946.0:17 as 6414 bytes in 0 ms
> 14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Lost TID
> 133 (task 946.0:4)
> 14-05-09 17:55:51 WARN [Result resolver thread-0] TaskSetManager: Loss was
> due to java.lang.StackOverflowError
> java.lang.StackOverflowError
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>
> ............................................
>
>     at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
> TaskSetManager: Starting task 946.0:4 as TID 147 on executor 6:
> rxxxx15.mc10.site.net (PROCESS_LOCAL)
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
> TaskSetManager: Serialized task 946.0:4 as 6414 bytes in 0 ms
> 14-05-09 17:55:51 WARN [Result resolver thread-1] TaskSetManager: Lost TID
> 139 (task 946.0:10)
> 14-05-09 17:55:51 INFO [Result resolver thread-1] TaskSetManager: Loss was
> due to java.lang.StackOverflowError [duplicate 1]
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
> CoarseGrainedSchedulerBackend: Executor 4 disconnected, so removing it
> 14-05-09 17:55:51 ERROR [spark-akka.actor.default-dispatcher-5]
> YarnClusterScheduler: Lost executor 4 on rxxxxx01.mc10.site.net: remote
> Akka client disassociated
> 14-05-09 17:55:51 INFO [spark-akka.actor.default-dispatcher-5]
> TaskSetManager: Re-queueing tasks for 4 from TaskSet 992.0
>
> Has anyone had a similar issue?
> Or could anyone provide a clue about where I should start looking?
>
> Thanks in advance!
