phoenix-dev mailing list archives

From "孟庆义(孟庆义)" <>
Subject HashJoin become slower or even fail due to use of ChunkedResultIterator
Date Fri, 29 Aug 2014 03:15:41 GMT


My use case is “select * from A inner join B on xx where xx”. Table A has
about 400 million rows, but the result contains only a few rows.


Problem 1: Before ChunkedResultIterator was introduced, the
SpoolingResultIterators ran in parallel once they were created in
ParallelIterators. Now they run serially.

In my case, the query runs 5 times faster without ChunkedResultIterator. A
chunked scan is also unnecessary here, because very few rows are actually
returned.
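
To illustrate why the ordering matters, here is a minimal sketch in plain Java, not Phoenix code. The class ScanOrderSketch and the fakeScan helper are hypothetical stand-ins: each "scan" only sleeps, and the sketch compares submitting all scans to a thread pool up front with running them one after another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScanOrderSketch {
    // Hypothetical stand-in for one region scan: sleeps, then returns a row count.
    static Callable<Integer> fakeScan(final long millis) {
        return () -> {
            Thread.sleep(millis);
            return 1;
        };
    }

    // Builds n identical fake scans.
    static List<Callable<Integer>> scans(int n, long millis) {
        List<Callable<Integer>> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(fakeScan(millis));
        }
        return list;
    }

    // Starts every scan at once and waits for all of them; returns elapsed ms.
    static long runParallel(List<Callable<Integer>> scans) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(scans.size());
        long start = System.nanoTime();
        try {
            for (Future<Integer> f : pool.invokeAll(scans)) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Runs each scan only after the previous one has finished; returns elapsed ms.
    static long runSerial(List<Callable<Integer>> scans) throws Exception {
        long start = System.nanoTime();
        for (Callable<Integer> s : scans) {
            s.call();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("parallel ms: " + runParallel(scans(4, 100)));
        System.out.println("serial ms:   " + runSerial(scans(4, 100)));
    }
}
```

In this toy model the serial run takes roughly the sum of the scan times, while the parallel run takes roughly the time of the slowest scan. The real ratio depends on the cluster and the data.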


Problem 2: Because the scans run serially, the HashCache on some region
servers (RS) can expire, and then the join fails. This happened in my case.
I fixed it by increasing the timeout to 60s (the default is 30s), but I think
a worse case could trigger it again in the future.

It’s hard to determine how long is enough.
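
The timing argument can be sketched with simple arithmetic. The model below is a simplification with made-up numbers: it assumes the hash cache expires a fixed time after it is sent and is never refreshed, and that in the parallel case every scan starts immediately. The timeout in question is presumably Phoenix's phoenix.coprocessor.maxServerCacheTimeToLiveMs setting, though the message does not name it. CacheExpirySketch and its methods are hypothetical.

```java
public class CacheExpirySketch {
    // Start time (ms after the hash cache was sent) of the last of n scans.
    // Serial: the last scan waits for the n-1 scans before it.
    // Parallel (assumed): every scan starts at time 0.
    static long lastScanStartMillis(int scans, long scanMillis, boolean serial) {
        return serial ? (long) (scans - 1) * scanMillis : 0L;
    }

    // True if the cache has already expired when the last scan starts.
    static boolean lastScanSeesExpiredCache(int scans, long scanMillis,
                                            long ttlMillis, boolean serial) {
        return lastScanStartMillis(scans, scanMillis, serial) >= ttlMillis;
    }

    public static void main(String[] args) {
        int scans = 8;            // made-up number of scans
        long scanMillis = 5_000;  // made-up duration of each scan
        for (long ttl : new long[] {30_000, 60_000}) {
            System.out.println("ttl=" + ttl
                + " serial expired=" + lastScanSeesExpiredCache(scans, scanMillis, ttl, true)
                + " parallel expired=" + lastScanSeesExpiredCache(scans, scanMillis, ttl, false));
        }
    }
}
```

With these numbers the last serial scan starts at 35s, so a 30s timeout fails and a 60s timeout passes. Raising the scan count to 14 pushes the last start to 65s and breaks the 60s timeout as well, which is why no fixed value is obviously safe.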


My solution is to add a config option to enable/disable ChunkedResultIterator.
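
A rough sketch of what such a switch could look like, in plain Java rather than actual Phoenix code. The property name example.query.useChunkedIterator, the class IteratorToggleSketch, and the Strategy enum are all hypothetical. The default is assumed to be true so that existing behavior stays unchanged unless the option is set.

```java
import java.util.Properties;

public class IteratorToggleSketch {
    // Hypothetical property name; not an actual Phoenix setting.
    static final String USE_CHUNKED = "example.query.useChunkedIterator";

    // The two strategies discussed in this message.
    enum Strategy { CHUNKED, SPOOLING }

    // Picks the iterator strategy from configuration; chunked stays the default.
    static Strategy chooseStrategy(Properties props) {
        boolean chunked = Boolean.parseBoolean(props.getProperty(USE_CHUNKED, "true"));
        return chunked ? Strategy.CHUNKED : Strategy.SPOOLING;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println("default:  " + chooseStrategy(props));

        // A join that returns few rows could opt out of chunking.
        props.setProperty(USE_CHUNKED, "false");
        System.out.println("disabled: " + chooseStrategy(props));
    }
}
```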

I’m looking forward to your advice.



