spark-dev mailing list archives

From 吴晓菊 <chrysan...@gmail.com>
Subject Re: why BroadcastHashJoinExec is not implemented with outputOrdering?
Date Thu, 28 Jun 2018 14:07:44 GMT
Why can't we use the output ordering of the big table?


Chrysan Wu
Phone:+86 17717640807


2018-06-28 21:48 GMT+08:00 Marco Gaido <marcogaido91@gmail.com>:

> The short answer is that SortMergeJoin guarantees an outputOrdering,
> while BroadcastHashJoin doesn't, i.e. after running a BroadcastHashJoin you
> don't know what the order of the output will be, since nothing
> enforces it.
>
> Hope this helps.
> Thanks.
> Marco
>
> 2018-06-28 15:46 GMT+02:00 吴晓菊 <chrysanxia@gmail.com>:
>
>>
>> We see that SortMergeJoinExec is implemented with both
>> outputPartitioning and outputOrdering, while BroadcastHashJoinExec is
>> implemented with outputPartitioning only. Why is it designed this way?
>>
>> Chrysan Wu
>> Phone:+86 17717640807
>>
>>
>
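
To make the distinction in this thread concrete, here is a toy sketch in plain Python. It is not Spark code: the function names sort_merge_join and broadcast_hash_join and the sample rows are hypothetical, and the real operators work on partitions and generated code. Under those assumptions, it shows that a merge join emits rows in key order by construction, while a hash join that streams the big side emits rows in whatever order the streamed side happens to arrive in.

```python
from collections import defaultdict


def sort_merge_join(left, right):
    """Toy inner join on the first tuple field: sort both sides, then merge."""
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with that key against the run.
            while i < len(left) and left[i][0] == lk:
                for r in right[j:j_end]:
                    out.append((lk, left[i][1], r[1]))
                i += 1
            j = j_end
    return out


def broadcast_hash_join(streamed, broadcast):
    """Toy inner join: build a hash table on the small side, probe with the big side."""
    table = defaultdict(list)
    for k, v in broadcast:
        table[k].append(v)
    out = []
    # Rows are emitted in the order the streamed (big) side is iterated.
    for k, v in streamed:
        for b in table.get(k, []):
            out.append((k, v, b))
    return out


big = [(3, "c"), (1, "a"), (2, "b"), (1, "d")]    # streamed side, unsorted
small = [(1, "x"), (2, "y"), (3, "z")]            # broadcast side

smj = sort_merge_join(big, small)
bhj = broadcast_hash_join(big, small)

print([row[0] for row in smj])  # keys come out sorted
print([row[0] for row in bhj])  # keys follow the order of `big`
```

In this toy model the hash join output is sorted only if the streamed input was already sorted, for example broadcast_hash_join(sorted(big), small). Whether the Spark planner can rely on that property is exactly the question raised above.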
