spark-user mailing list archives

From Shady Xu <shad...@gmail.com>
Subject Re: OOM when running Spark SQL by PySpark on Java 8
Date Thu, 13 Oct 2016 09:42:08 GMT
All nodes of my YARN cluster are running Java 7, but I submit the job
from a Java 8 client.

I realised I was running the job in YARN cluster mode, which is why setting
'--driver-java-options' is effective. Now the question is why submitting a
job from a Java 8 client to a Java 7 cluster causes a PermGen OOM.
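
In case it helps anyone else, a minimal sketch of what I plan to try next:
pinning JAVA_HOME for the YARN containers so the client and the cluster agree
on a JVM (the JDK path and script name below are placeholders for wherever the
JDK and my job actually live):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/java/jdk1.7.0 \
      --conf spark.executorEnv.JAVA_HOME=/usr/java/jdk1.7.0 \
      my_job.py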

2016-10-13 17:30 GMT+08:00 Sean Owen <sowen@cloudera.com>:

> You can specify it; it just doesn't do anything except cause a warning on
> Java 8. It won't work in general to have such a tiny PermGen, so if it's
> working, it means you're on Java 8 and the flag is ignored. You should set
> MaxPermSize if anything, not PermSize. However, the error indicates you are
> not using Java 8 everywhere on your cluster, and that's a potentially
> bigger problem.
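>
> For illustration, on a Java 8 JVM the flag is accepted but ignored with a
> warning along these lines (exact wording may vary by JVM build):
>
>     $ java -XX:MaxPermSize=100m -version
>     Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=100m; support was removed in 8.0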
>
> On Thu, Oct 13, 2016 at 10:26 AM Shady Xu <shadyxu@gmail.com> wrote:
>
>> Solved the problem by specifying the PermGen size when submitting the job
>> (even just a few MB worked).
>>
>> It seems Java 8 has removed the permanent generation space, so the
>> corresponding JVM arguments are ignored. But I can still
>> use --driver-java-options "-XX:PermSize=80M -XX:MaxPermSize=100m" to
>> specify them when submitting the Spark job, which is weird. I don't know
>> whether it has anything to do with py4j, as I am not familiar with it.
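>>
>> For reference, a minimal sketch of the submit command that worked (the
>> script name is a placeholder; the sizes are just the values I tried):
>>
>>     spark-submit \
>>       --master yarn \
>>       --deploy-mode cluster \
>>       --driver-java-options "-XX:PermSize=80M -XX:MaxPermSize=100m" \
>>       my_job.py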
>>
>> 2016-10-13 17:00 GMT+08:00 Shady Xu <shadyxu@gmail.com>:
>>
>> Hi,
>>
>> I have a problem when running Spark SQL by PySpark on Java 8. Below is
>> the log.
>>
>>
>> 16/10/13 16:46:40 INFO spark.SparkContext: Starting job: sql at NativeMethodAccessorImpl.java:-2
>> Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: PermGen space
>> 	at java.lang.ClassLoader.defineClass1(Native Method)
>> 	at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>> 	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>> 	at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>> 	at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:857)
>> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1630)
>> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
>> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
>> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> Exception in thread "shuffle-server-2" java.lang.OutOfMemoryError: PermGen space
>> Exception in thread "shuffle-server-4" java.lang.OutOfMemoryError: PermGen space
>> Exception in thread "threadDeathWatcher-2-1" java.lang.OutOfMemoryError: PermGen space
>>
>>
>> I tried increasing the driver memory, but it didn't help. However, things are
>> fine when I run the same code after switching to Java 7. I also find it fine
>> to run the SparkPi example on Java 8. So I believe the problem lies with
>> PySpark rather than with Spark core; the commands I compared are below.
>>
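>> Roughly, the comparison was (the script name is a placeholder for my
>> PySpark job):
>>
>>     ./bin/run-example SparkPi 100      # fine on Java 8
>>     ./bin/spark-submit my_sql_job.py   # OOMs on Java 8, fine on Java 7
>>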
>>
>> I am using Spark 2.0.1 and run the program in YARN cluster mode. Any ideas
>> are appreciated.
>>
>>
>>
