Aaditya Ramesh created SPARK-19525:
--------------------------------------
Summary: Enable Compression of Spark Streaming Checkpoints
Key: SPARK-19525
URL: https://issues.apache.org/jira/browse/SPARK-19525
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 2.1.0
Reporter: Aaditya Ramesh
In our testing, compressing partitions with Snappy while writing them to checkpoints on HDFS
significantly improved performance and also reduced the variability of the checkpointing
operation. Checkpointing time was reduced by 3x, and its variability by 2x, for data sets
with a compressed size of approximately 1 GB.
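
The idea can be illustrated with a minimal, self-contained sketch. It is not Spark's
checkpointing code. The assumptions are: zlib at level 1 stands in for Snappy (which is not
in the Python standard library), a temporary local directory stands in for HDFS, and the
helper names (write_partition, read_partition, checkpoint_demo) and the sample records are
hypothetical.

```python
import os
import pickle
import tempfile
import zlib


def write_partition(path, records, compress):
    """Serialize one partition and write it to `path`, optionally compressed."""
    payload = pickle.dumps(records)
    if compress:
        # Assumption: zlib level 1 (fast, modest ratio) stands in for Snappy.
        payload = zlib.compress(payload, 1)
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)  # bytes actually written


def read_partition(path, compressed):
    """Read a partition file back, decompressing it if needed."""
    with open(path, "rb") as f:
        payload = f.read()
    if compressed:
        payload = zlib.decompress(payload)
    return pickle.loads(payload)


def checkpoint_demo():
    # Hypothetical partition: repetitive records, which compress well.
    records = [("user-%d" % (i % 100), i % 7) for i in range(50_000)]
    with tempfile.TemporaryDirectory() as d:
        raw_path = os.path.join(d, "part-00000")
        zip_path = os.path.join(d, "part-00000.z")
        raw_size = write_partition(raw_path, records, compress=False)
        zip_size = write_partition(zip_path, records, compress=True)
        restored = read_partition(zip_path, compressed=True)
    return raw_size, zip_size, restored == records
```

Calling checkpoint_demo() returns the uncompressed size, the compressed size, and whether the
compressed partition round-trips intact. The sketch only shows that fewer bytes reach storage.
It does not reproduce the timing or variability numbers reported above.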