spark-user mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: StackOverflow still after implementing custom serializers when working with large data set
Date Tue, 17 Sep 2013 16:38:28 GMT
If more context is needed, I am happy to provide it.  This is a very
troubling issue for us, as it seriously limits how much data we can look at
at one time in Spark.  For now, I am able to fall back to Hive to get the
job done.


On Fri, Sep 13, 2013 at 3:19 PM, Gary Malouf <malouf.gary@gmail.com> wrote:

> I was previously running into StackOverflowErrors when working with one
> or two days' worth of data.  Steps I have taken since then:
>
> 1) Increased the stack size (-Xss) from the default to 2m, and then to as
>    high as 200m
> 2) Activated Kryo serialization
> 3) Implemented custom serializers for my protobuf messages
>
> While these changes have allowed me to grab up to 10 days' worth of data, I
> cannot really go beyond that without the dreaded StackOverflowError:
>
> java.lang.StackOverflowError
>     at
> java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2291)
>     at
> java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2584)
>     at
> java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2594)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>
>
> It seems to get stuck in what looks like endless recursion during
> deserialization.  Has anyone found a way to work around this?
>

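The repeating readObject0 / readOrdinaryObject / readSerialData /
defaultReadFields frames in the trace above are the pattern Java's built-in
serialization produces when it recurses once per level of a nested object
graph. The following is a minimal sketch of that behavior in isolation, not a
reproduction of the Spark job: the Node class, the chain depth, and the two
stack sizes are all assumptions made up for illustration, and the stack size
passed to the Thread constructor is only a hint that some platforms may ignore.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

public class DeepGraphDemo {
    /** Hypothetical stand-in for any object that holds a reference to another of its kind. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Node next;
        Node(Node next) { this.next = next; }
    }

    // Illustrative values only; they are not taken from the original message.
    static final long SMALL_STACK = 512L * 1024;         // 512 KB
    static final long LARGE_STACK = 512L * 1024 * 1024;  // 512 MB

    /** Builds the chain with a loop, so construction itself never recurses. */
    static Node chain(int depth) {
        Node head = null;
        for (int i = 0; i < depth; i++) {
            head = new Node(head);
        }
        return head;
    }

    /** Runs the task on a new thread with the requested stack size and returns its result. */
    static <T> T runWithStack(long stackBytes, Callable<T> task) throws Exception {
        AtomicReference<T> result = new AtomicReference<>();
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(null, () -> {
            try {
                result.set(task.call());
            } catch (Throwable t) {
                failure.set(t);
            }
        }, "serde-worker", stackBytes);
        worker.start();
        worker.join();
        Throwable t = failure.get();
        if (t instanceof Error) throw (Error) t;
        if (t instanceof Exception) throw (Exception) t;
        return result.get();
    }

    /** Default Java serialization: one group of nested calls per level of the graph. */
    static byte[] serialize(Object value) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Returns true if reading the data back exhausts a thread stack of the given size. */
    static boolean deserializeOverflows(byte[] data, long stackBytes) throws Exception {
        return DeepGraphDemo.<Boolean>runWithStack(stackBytes, () -> {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
                in.readObject();
                return false;
            } catch (StackOverflowError e) {
                return true;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Serialize on the large stack so that only the read side is under test.
        byte[] deep = runWithStack(LARGE_STACK, () -> serialize(chain(100_000)));
        System.out.println("512 KB stack overflows: " + deserializeOverflows(deep, SMALL_STACK));
        System.out.println("512 MB stack overflows: " + deserializeOverflows(deep, LARGE_STACK));
    }
}
```

If the stack-size hint is honored, the same bytes should overflow the small
stack and load on the large one. That would suggest the depth of the object
graph being deserialized, rather than a true infinite loop, is what sets the
limit, which is consistent with a larger -Xss buying more days of data without
removing the ceiling. Whether the graph in the original job has this shape
cannot be determined from the trace alone.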