drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aman Sinha <amansi...@apache.org>
Subject Re: query plan ....
Date Mon, 24 Aug 2015 21:10:41 GMT
Indeed, it is not efficient. We are doing 16 invocations of
CONVERT_FROMUTF8($1)  and 16 invocations of CONVERT_FROMUTF8($2).
Can you pls file a JIRA ?  We should ideally be doing projection pushdown
in conjunction with the filter pushdown in to the HBase scan and
computing these functions only once.

Aman


On Mon, Aug 24, 2015 at 1:34 PM, Sungwook Yoon <syoon@maprtech.com> wrote:

> Hi,
>
> I have a query, doing something like
>
> a in (v1, v2, v3, .... v15)
>
> The physical query plan looks like the following.
>
> Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), <=(CAST($0):INTEGER,
> 2013), OR(=(CONVERT_FROMUTF8($1), '39891'), =(CONVERT_FROMUTF8($1),
> '4280'), =(CONVERT_FROMUTF8($1), '4281'), =(CONVERT_FROMUTF8($1), '42820'),
> =(CONVERT_FROMUTF8($1), '42821'), =(CONVERT_FROMUTF8($1), '42822'),
> =(CONVERT_FROMUTF8($1), '42823'), =(CONVERT_FROMUTF8($1), '42830'),
> =(CONVERT_FROMUTF8($1), '42831'), =(CONVERT_FROMUTF8($1), '42832'),
> =(CONVERT_FROMUTF8($1), '42833'), =(CONVERT_FROMUTF8($1), '42840'),
> =(CONVERT_FROMUTF8($1), '42841'), =(CONVERT_FROMUTF8($1), '42842'),
> =(CONVERT_FROMUTF8($1), '42843'), =(CONVERT_FROMUTF8($1), '4289'),
> =(CONVERT_FROMUTF8($2), '39891'), =(CONVERT_FROMUTF8($2), '4280'),
> =(CONVERT_FROMUTF8($2), '4281'), =(CONVERT_FROMUTF8($2), '42820'),
> =(CONVERT_FROMUTF8($2), '42821'), =(CONVERT_FROMUTF8($2), '42822'),
> =(CONVERT_FROMUTF8($2), '42823'), =(CONVERT_FROMUTF8($2), '42830'),
> =(CONVERT_FROMUTF8($2), '42831'), =(CONVERT_FROMUTF8($2), '42832'),
> =(CONVERT_FROMUTF8($2), '42833'), =(CONVERT_FROMUTF8($2), '42840'),
> =(CONVERT_FROMUTF8($2), '42841'), =(CONVERT_FROMUTF8($2), '42842'),
> =(CONVERT_FROMUTF8($2), '42843'), =(CONVERT_FROMUTF8($2), '4289')))]) :
> rowType = RecordType(ANY year, ANY DX1, ANY DX2): rowcount =
> 3.300738791875E8, cumulative cost = {1.0562364134E10 rows,
> 5.413211618675E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7136
>
>
> In this plan, does the drill convert to string the same column to multiple
> times as many as the values it is comparing against?
>
> From the performance, it looks like it is doing that ...
>
> Sungwook
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message