crunch-user mailing list archives

From 陈竞
Subject Re: confused about the MapsideJoinStrategy, why use LoadLeftSideMapsideJoinStrategy, what if left table is too large to store in memory?
Date Wed, 11 May 2016 01:42:27 GMT
MapsideJoinStrategy.create() uses LoadLeftSideMapsideJoinStrategy. I'm just
confused about why LoadLeftSideMapsideJoinStrategy is better than the default.

According to the comments in the code, LoadLeftSideMapsideJoinStrategy performs
better than the default strategy, but I don't know why.
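
To make the question concrete, here is a minimal plain-Java sketch of the map-side join pattern described in the quoted message below: the small table B is loaded into an in-memory hash map, and the large table A is streamed past it one row at a time. This is not Crunch code and not how either strategy is implemented. The class name MapsideJoinSketch, the method leftJoin, and the {key, value} row layout are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names are Crunch APIs.
public class MapsideJoinSketch {

    /**
     * Left join of a large streamed side against a small side held in memory.
     * Each row is a {key, value} pair. Streamed rows without a match are
     * emitted with a null right-hand value.
     */
    static List<String[]> leftJoin(Iterable<String[]> streamedLarge,
                                   Iterable<String[]> smallInMemory) {
        // Build phase: load the small table (B) into a hash map keyed by join key.
        Map<String, List<String>> lookup = new HashMap<>();
        for (String[] row : smallInMemory) {
            lookup.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }

        // Probe phase: read the large table (A) one row at a time.
        // The result is collected in a list here only to keep the sketch short;
        // a real job would emit each joined row instead of buffering them.
        List<String[]> joined = new ArrayList<>();
        for (String[] row : streamedLarge) {
            List<String> matches = lookup.get(row[0]);
            if (matches == null) {
                joined.add(new String[] {row[0], row[1], null});
            } else {
                for (String right : matches) {
                    joined.add(new String[] {row[0], row[1], right});
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String[]> a = Arrays.asList(
                new String[] {"k1", "a1"},
                new String[] {"k2", "a2"},
                new String[] {"k3", "a3"});
        List<String[]> b = Arrays.asList(new String[] {"k1", "b1"});

        for (String[] r : leftJoin(a, b)) {
            System.out.println(r[0] + "," + r[1] + "," + r[2]);
        }
    }
}
```

In this sketch only B has to fit in memory, whichever argument position it is passed in. That is the property I am asking about when the in-memory side is fixed to be the left one.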

2016-05-10 11:30 GMT+08:00 David Ortiz:

> Try MapsideJoinStrategy.create()
> On Mon, May 9, 2016, 9:29 PM 陈竞 wrote:
>> Hi, I'm very confused about how to use MapsideJoinStrategy. The original
>> constructor was deprecated, and LoadLeftSideMapsideJoinStrategy is
>> recommended instead. Its main improvement is that it loads the left-side
>> table, which is larger than the right side, into memory. However, when I
>> want to use a map-side join, the left-side table is usually too large to
>> fit in memory.
>> For example, I have two tables A and B. We need A left join B, and
>> size(A) >> size(B). Naturally we want to use a map-side join with A as the
>> left side and B as the right side, then load B into memory to process the
>> join. It's very simple. However, if we use LoadLeftSideMapsideJoinStrategy,
>> we have to use A as the right side and B as the left side, which brings no
>> improvement and adds a reverse DoFn.
>> --
>> 陈竞, High Performance Computer Center, Institute of Computing Technology, Chinese Academy of Sciences
>> Jing Chen HPCC.ICT.AC China

Jing Chen HPCC.ICT.AC China
