drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Abdel Hakim Deneche <adene...@maprtech.com>
Subject Re: Wrong results - windowing query over view
Date Wed, 24 Jun 2015 00:31:50 GMT
What happens if you run the queries on the original parquet files and not
the views ?

On Tue, Jun 23, 2015 at 5:28 PM, Khurram Faraaz <kfaraaz@maprtech.com>
wrote:

> Windowing query over a view returns wrong results when used with and
> without a group by clause. Please let me know if this is a planning bug ?
> Postgres does not support the query where we use a group by.
>
> DDL used for view creation was,
>
> create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
> col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
> col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
> col_dbl, col_tm from `tblForView/0_0_0.parquet`;
>
>
> The two queries are,
>
>
> 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;
>
> *+---------+*
>
> *| **EXPR$0 ** |*
>
> *+---------+*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *| *-19    * |*
>
> *+---------+*
>
> 30 rows selected (0.26 seconds)
>
>
> Explain plan for the above query
>
>
> *| *00-00    Screen
>
> 00-01      Project(EXPR$0=[$0])
>
> 00-02        Project($0=[$9])
>
> 00-03          Window(window#0=[window(partition {} order by [] range
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
>
> 00-04            Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
> col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
> col_dbl=[$8], col_tm=[$5])
>
> 00-05              Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
> `col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])
>
>
> 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
> group by col_char_2;
>
> *+---------+*
>
> *| **EXPR$0 ** |*
>
> *+---------+*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *| *AZ     * |*
>
> *+---------+*
>
> 18 rows selected (0.27 seconds)
>
>
> Explain plan for the above query that uses group by
>
>
> *| *00-00    Screen
>
> 00-01      Project(EXPR$0=[$0])
>
> 00-02        Project($0=[$2])
>
> 00-03          Window(window#0=[window(partition {} order by [] range
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
>
> 00-04            HashAgg(group=[{0}], agg#0=[MIN($1)])
>
> 00-05              Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> columns=[`col_char_2`, `col_int`]]])
>
>
> Thanks,
>
> Khurram
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message