spark-user mailing list archives

From Xin Liu
Subject SizeEstimator
Date Tue, 27 Feb 2018 00:47:07 GMT
Hi folks,

We have a situation where the shuffled data is protobuf-based, and
SizeEstimator is taking a lot of time.

We tried overriding SizeEstimator to return a constant value, which
speeds things up a lot.

My question: what are the side effects of disabling SizeEstimator? Is it
just that Spark has to reallocate memory, or are there more severe consequences?

