phoenix-dev mailing list archives

From "孟庆义(孟庆义)" <qingyi....@alibaba-inc.com>
Subject HashJoin become slower or even fail due to use of ChunkedResultIterator
Date Fri, 29 Aug 2014 03:15:41 GMT
Dear all,

 

My use case is "select * from A inner join B on xx where xx". Table A has
about 400 million rows, but the result contains only a few rows.

 

Problem 1: Before ChunkedResultIterator was introduced, the
SpoolingResultIterators created in ParallelIterators ran in parallel. Now
they run serially.

In my case, the query is 5 times faster without ChunkedResultIterator. A
chunked scan is also unnecessary here, because very few rows are actually
returned.

 

Problem 2: Because the scans run serially, the HashCache on some region
servers (RS) may expire before it is used, and the join then fails. This
happened in my case. I fixed it by increasing the timeout to 60s (the
default is 30s), but I think a worse case could trigger it again in the
future. It is hard to determine how long is long enough.

 

My proposed solution is to add a config option to enable or disable
ChunkedResultIterator.
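
To make the proposal concrete, here is a minimal sketch of such a switch. It
is only an illustration: the property name "example.query.useChunkedIterator",
the class ChunkToggleSketch, and the IteratorKind enum are all hypothetical
and are not existing Phoenix names. The sketch assumes the option defaults to
the current behaviour (chunking enabled).

```java
import java.util.Properties;

public class ChunkToggleSketch {
    // Hypothetical property name, used only for this illustration.
    static final String USE_CHUNKED_ATTRIB = "example.query.useChunkedIterator";

    // Stand-ins for the two code paths: wrap each scan in a chunked
    // iterator, or use the spooling iterator directly as before.
    enum IteratorKind { CHUNKED, SPOOLING_ONLY }

    // Reads the switch; an unset property keeps chunking enabled.
    static IteratorKind chooseIterator(Properties props) {
        boolean chunked =
            Boolean.parseBoolean(props.getProperty(USE_CHUNKED_ATTRIB, "true"));
        return chunked ? IteratorKind.CHUNKED : IteratorKind.SPOOLING_ONLY;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(chooseIterator(props)); // option not set

        props.setProperty(USE_CHUNKED_ATTRIB, "false");
        System.out.println(chooseIterator(props)); // option disabled
    }
}
```

With the option disabled, a query like mine would take the non-chunked path,
while all other users would keep the default behaviour.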

I look forward to your advice.

 

Daniel.Meng

 

