tinkerpop-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TINKERPOP-1341) UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas GryoSerializer works
Date Fri, 01 Jul 2016 12:07:10 GMT

    [ https://issues.apache.org/jira/browse/TINKERPOP-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358859#comment-15358859 ]

ASF GitHub Bot commented on TINKERPOP-1341:
-------------------------------------------

Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/353
  
    VOTE +1


> UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas GryoSerializer works
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1341
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1341
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 3.2.1, 3.3.0
>            Reporter: Dylan Bethune-Waddell
>            Priority: Minor
>
> When trying to bulk load a large dataset into Titan I was running into OOM errors, so I
> decided to try tweaking some Spark configuration settings. I am already having trouble bulk
> loading with the new GryoRegistrator/UnshadedKryo serialization shim in master: a few hundred
> tasks into the edge loading stage (stage 5), exceptions are thrown complaining about the need
> to explicitly register CompactBuffer[].class with Kryo. With spark.rdd.compress=true, that
> same setup instead fails a few hundred tasks into the vertex loading stage (stage 1) of
> BulkLoaderVertexProgram. Using GryoSerializer instead of KryoSerializer with GryoRegistrator
> does not fail and successfully loads the data with this compression flag turned on, whereas
> before I would just get OOM errors until eventually the job was set back so far that it
> failed. So this setting is desirable in some instances, and the new serialization code seems
> to break it. This could be an upstream Spark issue, based on this open JIRA ticket
> (https://issues.apache.org/jira/browse/SPARK-3630). Here is the exception that is thrown,
> with the middle bits cut out:
> com.esotericsoftware.kryo.KryoException: java.io.IOException: PARSING_ERROR(2)
>         at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
>         at com.esotericsoftware.kryo.io.Input.require(Input.java:169)
>         at com.esotericsoftware.kryo.io.Input.readLong_slow(Input.java:715)
>         at com.esotericsoftware.kryo.io.Input.readLong(Input.java:665)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
>         at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:48)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:30)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.readEdges(StarGraphSerializer.java:134)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:91)
>         at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:45)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
>         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:42)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:30)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:46)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:36)
>         at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
>         at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>         at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
> ........................................................ and so on .....................................
> Caused by: java.io.IOException: PARSING_ERROR(2)
>         at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
>         at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
>         at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:594)
>         at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:358)
>         at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:167)
>         at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:150)
>         at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
>         ... 51 more
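
For reference, the two serializer setups contrasted in the report can be sketched as plain
Spark property maps. This is a minimal sketch under assumptions, not the reporter's actual job
configuration: the property keys are standard Spark settings, the fully qualified class names
are assumed from the package seen in the stack trace and from the spark-gremlin module, and
the class SerializerConfigs and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SerializerConfigs {
    // Package assumed from the stack trace above (…spark.structure.io.gryo).
    static final String GRYO_PACKAGE =
            "org.apache.tinkerpop.gremlin.spark.structure.io.gryo";

    // Setup reported to fail in stage 1: Spark's KryoSerializer with
    // GryoRegistrator, plus RDD compression enabled.
    static Map<String, String> kryoWithRegistrator() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.put("spark.kryo.registrator", GRYO_PACKAGE + ".GryoRegistrator");
        conf.put("spark.rdd.compress", "true");
        return conf;
    }

    // Setup reported to work: GryoSerializer with the same compression flag.
    static Map<String, String> gryoSerializer() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.serializer", GRYO_PACKAGE + ".GryoSerializer");
        conf.put("spark.rdd.compress", "true");
        return conf;
    }

    public static void main(String[] args) {
        // Print both variants in properties-file form for side-by-side comparison.
        System.out.println("# KryoSerializer + GryoRegistrator (reported failing)");
        kryoWithRegistrator().forEach((k, v) -> System.out.println(k + "=" + v));
        System.out.println("# GryoSerializer (reported working)");
        gryoSerializer().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The only differences between the two maps are the serializer class and the presence of the
registrator; spark.rdd.compress=true is set in both, matching the comparison in the report.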



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
