spark-user mailing list archives

From Rajkiran Rajkumar <>
Subject Kryo serializer slower than Java serializer for Spark Streaming
Date Thu, 06 Oct 2016 13:47:23 GMT
I am running a Spark Streaming application that reads from a Kinesis
stream and processes the data. The application runs on EMR. Recently, we
tried moving from Java's built-in serializer to the Kryo serializer. To
quantify the performance improvement, I sent 30,000 input records to the
application over a period of 5 minutes. Based on the task deserialization
time, I have the following data:
Using Java serializer: median 3 ms, mean 8.21 ms
Using Kryo serializer: median 4 ms, mean 9.64 ms

Here, we see that the Kryo serializer is slower than the Java serializer.
I am looking for advice on anything I might have failed to take into
account. Please let me know if more information is needed.
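
For reference, the switch from the Java serializer to Kryo described above
is usually made through SparkConf. The following is only a minimal sketch
of such a configuration, not the actual setup of this application: the
application name and the InputRecord class are hypothetical placeholders.
Registering classes is optional, but without it Kryo writes the full class
name alongside each serialized object.

```scala
import org.apache.spark.SparkConf

// Hypothetical record type, standing in for whatever the application
// actually passes between stages; not taken from the original setup.
case class InputRecord(id: String, payload: Array[Byte])

object KryoConfSketch {
  // Builds a SparkConf that uses Kryo for data serialization.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("kinesis-streaming-app") // hypothetical application name
      // Replace the default Java serializer with Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Register application classes so Kryo does not have to store
      // the full class name with every serialized object.
      .registerKryoClasses(Array(classOf[InputRecord]))
}
```

The resulting SparkConf would then be passed to the StreamingContext (or
SparkContext) when the application starts.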

