drill-user mailing list archives

From Aman Sinha <amansi...@apache.org>
Subject Re: query plan ....
Date Tue, 25 Aug 2015 00:20:53 GMT
I was about to say that for IN lists of 20 or more values, Drill uses a more
efficient Values operator instead of OR conditions, but then I realized that
the OR filter references 4 different columns, $1..$4, and each of those
individual lists has fewer than 20 values. Sungwook, can you please provide
the SQL query and any view definitions or anything else that goes with it? It
is difficult to figure things out without the full picture.
thanks,
Aman

On Mon, Aug 24, 2015 at 5:10 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> On Mon, Aug 24, 2015 at 4:50 PM, Sungwook Yoon <syoon@maprtech.com> wrote:
>
> > Still, the performance drop due to OR filtering is just astounding...
> >
>
> That is what query optimizers are for and why getting them to work well is
> important.
>
> The difference in performance that you are observing is not surprising
> given the redundant work that you are seeing. Using the OR operator
> prevents any significant short-circuiting and the repeated conversion
> operations that are happening make the evaluation much more expensive than
> it would otherwise be (a dozen extra copies where only one is needed).
>
> Other queries that can be subject to similar problems include common table
> expressions that read the same (large) input file many times.  So far,
> Drill doesn't optimize all such expressions well.
>
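
To make the point about repeated conversions concrete, here is a minimal Python sketch. It is not Drill code and does not reflect how Drill evaluates filters internally. The names `convert`, `or_filter_repeated`, `filter_hoisted`, and `run` are hypothetical, and `convert` merely stands in for whatever cast or conversion expression the plan repeats in each OR branch. The first filter converts the same raw value once per comparison with no short-circuiting; the second converts it once and does a single set lookup.

```python
# Illustrative only: a toy model of an OR filter, not Drill's implementation.
stats = {"conversions": 0}


def convert(raw):
    """Stand-in for a cast/convert expression; counts how often it runs."""
    stats["conversions"] += 1
    return raw.decode("utf-8")


def or_filter_repeated(raw, wanted):
    # Each OR branch converts the raw value again. Building the full list
    # before calling any() models evaluation without short-circuiting.
    return any([convert(raw) == w for w in wanted])


def filter_hoisted(raw, wanted_set):
    # Convert once per row, then test membership with one hash lookup.
    return convert(raw) in wanted_set


def run(filter_fn, rows, wanted):
    """Apply filter_fn to every row; return (results, conversion count)."""
    stats["conversions"] = 0
    results = [filter_fn(r, wanted) for r in rows]
    return results, stats["conversions"]


WANTED = [f"v{i}" for i in range(12)]   # a dozen literals in the OR list
ROWS = [b"v3", b"zz", b"v11"]           # made-up sample rows

repeated_results, repeated_count = run(or_filter_repeated, ROWS, WANTED)
hoisted_results, hoisted_count = run(filter_hoisted, ROWS, frozenset(WANTED))

print(repeated_results, repeated_count)
print(hoisted_results, hoisted_count)
```

Both filters select the same rows, but in this toy model the repeated form performs one conversion per literal per row, while the hoisted form performs one per row.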
