crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-216) Transpose arguments in MapsideJoinStrategy.join
Date Mon, 19 Aug 2013 16:10:50 GMT


Gabriel Reid commented on CRUNCH-216:

I have the feeling that it's better to stay away from trying to be too clever with that stuff.
I find that even when I remember to implement a decent scaleFactor method, it's still pretty
hit-and-miss whether the getSize method returns a reliable size (i.e. it's just really hard
to do it correctly).

On the other hand, usually when you're using a MapSideJoin there is going to be a really big
difference in the size of collections being joined, so maybe it would be ok even if the size
heuristic isn't that reliable. 
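
To make the trade-off above concrete, here is a rough sketch of a size heuristic that only picks a side when the estimates differ by a wide margin. It is a plain Python model, not Crunch code: SizedTable, choose_in_memory_side and the min_ratio threshold are all hypothetical names chosen for illustration, and estimated_bytes merely stands in for whatever a getSize-style estimate would return.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SizedTable:
    """Hypothetical stand-in for a collection with a size estimate."""
    name: str
    estimated_bytes: int  # plays the role of a getSize()-style estimate


def choose_in_memory_side(left: SizedTable, right: SizedTable,
                          min_ratio: float = 10.0) -> Optional[SizedTable]:
    """Return the table to replicate in memory, or None if the
    estimates are too close (or too unknown) to trust."""
    small, big = sorted((left, right), key=lambda t: t.estimated_bytes)
    if small.estimated_bytes <= 0:
        # A zero or negative estimate is treated as "unknown": do not guess.
        return None
    if big.estimated_bytes / small.estimated_bytes < min_ratio:
        # The difference is within the noise of an unreliable estimate.
        return None
    return small
```

With a threshold like this, a sloppy estimate only matters when the two sizes are close, which is the case where a map-side join is a questionable choice anyway. The value 10.0 is an arbitrary assumption, not a recommendation.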
> Transpose arguments in MapsideJoinStrategy.join
> -----------------------------------------------
>                 Key: CRUNCH-216
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>            Reporter: Gabriel Reid
> The MapsideJoinStrategy currently specifies that the smaller table in the join (i.e.
> the table to be replicated and loaded in memory) should be on the right-hand side of the join.
> This is the opposite of what is done in all other join strategies, making it impossible
> to just switch out another join strategy for a MapsideJoinStrategy. The MapsideJoinStrategy
> could be brought in line with the other JoinStrategies to expect the smaller of two tables
> to be provided as the left-side table.
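
As an illustration of what transposing the arguments involves, the sketch below models an inner map-side join over lists of (key, value) pairs in plain Python. It is not the Crunch implementation: join_small_on_right and join_small_on_left are hypothetical names, with the first mimicking the current convention (small table on the right) and the second adapting it to take the small table on the left.

```python
def join_small_on_right(big, small):
    """Toy inner map-side join: `small` is loaded into a dict
    (the in-memory side) and `big` is streamed past it."""
    lookup = {}
    for key, value in small:
        lookup.setdefault(key, []).append(value)
    return [(key, (b, s)) for key, b in big for s in lookup.get(key, [])]


def join_small_on_left(small, big):
    """Adapter that accepts the small table first and still emits
    value pairs in (left, right) order."""
    swapped = join_small_on_right(big, small)
    # Swapping the inputs also swaps each value pair, so swap it back.
    return [(key, (s, b)) for key, (b, s) in swapped]
```

Note that the adapter has to reorder the output pairs as well as the inputs. This sketch covers only an inner join; for a left or right outer join, the join type would have to be flipped along with the arguments.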
