spark-user mailing list archives

From Josh Rosen
Subject Re: Shuffle write increases in spark 1.2
Date Sun, 04 Jan 2015 21:14:49 GMT
If you have a small reproduction for this issue, can you open a ticket at ?

On December 29, 2014 at 7:10:02 PM, Kevin Jung wrote:

Hi all,  
The size of shuffle write shown in the Spark web UI is much different when I
execute the same Spark job on the same input data (100GB) in both Spark 1.1 and Spark 1.2.
At the same sortBy stage, the size of shuffle write is 39.7GB in Spark 1.1
but 91.0GB in Spark 1.2.
I set the spark.shuffle.manager option to hash because its default value
changed, but Spark 1.2 still writes larger files than Spark 1.1.
Can anyone tell me why this happened?  
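
For readers who want to reproduce the setup, the setting described above could be pinned roughly as in the sketch below. It is only an illustration, not the code from the original job: the object name and application name are hypothetical, and the master URL is assumed to be supplied by spark-submit.

```scala
import org.apache.spark.SparkConf

// Hypothetical sketch: object and app names are not from the original message.
object ShuffleManagerSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("shuffle-write-comparison") // assumed name, for illustration only
      // Spark 1.2 changed the default shuffle manager from "hash" to "sort",
      // so set it explicitly to compare against Spark 1.1 behaviour.
      .set("spark.shuffle.manager", "hash")

    // Print the effective value to confirm which shuffle manager is configured.
    println(conf.get("spark.shuffle.manager"))
  }
}
```

Running the same job on both versions with this value fixed removes the changed default as a variable when comparing the shuffle write sizes reported in the web UI.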

