spark-user mailing list archives

From 马阳阳 <ma_yang_y...@163.com>
Subject Why were changes of SPARK-9241 removed?
Date Fri, 13 Mar 2020 03:23:48 GMT
Hi,
I wonder why the changes made in
"[SPARK-9241][SQL] Supporting
multiple DISTINCT columns (2) -
Rewriting Rule" are no longer
present in Spark (version 2.4).
Without them, count distinct runs
much slower in Spark 2.4 than in
Spark 1.6 or Hive: more than 18
minutes on Spark 2.4.4, versus
about 3 minutes on Spark 1.6 and
about 80 seconds on Hive.
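
For context, here is a minimal sketch of the expand-then-aggregate idea behind that rewriting rule, as I understand it. It is plain Python rather than Spark code. The function name count_distinct_multi and the sample rows are made up for illustration, and this is not the actual Catalyst rule.

```python
def count_distinct_multi(rows, key, distinct_cols):
    """Count distinct values of several columns per key.

    Illustrative only: mimics an expand + two-phase aggregation,
    not Spark's real implementation.
    """
    # Step 1 (expand): emit one tagged copy of each row per distinct
    # column. The tag (gid) says which column the value belongs to.
    # Null values are dropped because COUNT(DISTINCT ...) ignores them.
    expanded = [
        (row[key], gid, row[col])
        for row in rows
        for gid, col in enumerate(distinct_cols)
        if row[col] is not None
    ]

    # Step 2 (first aggregation): de-duplicate on (key, gid, value).
    deduped = set(expanded)

    # Step 3 (second aggregation): a plain count per (key, gid).
    counts = {row[key]: [0] * len(distinct_cols) for row in rows}
    for k, gid, _value in deduped:
        counts[k][gid] += 1

    return {k: dict(zip(distinct_cols, v)) for k, v in counts.items()}


rows = [
    {"k": "a", "x": 1, "y": "p"},
    {"k": "a", "x": 1, "y": "q"},
    {"k": "a", "x": 2, "y": "q"},
    {"k": "b", "x": 3, "y": None},
]
result = count_distinct_multi(rows, "k", ["x", "y"])
print(result)
```

The point of the expand step is that every distinct column can then be handled by the same pair of ordinary aggregations instead of a separate pass per column.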




