drill-user mailing list archives

From Neeraja Rentachintala <nrentachint...@maprtech.com>
Subject Re: Bug or Feature?
Date Thu, 04 Feb 2016 16:26:31 GMT
John,
What happens if you run the select query with no filter?

The scenario you described does seem like unexpected behavior.

-Neeraja

On Thu, Feb 4, 2016 at 8:21 AM, John Omernik <john@omernik.com> wrote:

> Prior to posting a JIRA, I thought I'd toss this here:
>
> If I have a directory, data, whose subdirectories contain parquet files:
>
>
> data/2016-01-01
> data/2016-01-02
>
> (Seem familiar? This came up in my other testing)
>
>
> If I have MORE than one subdirectory,
>
> then
>
> select count(1) from `data/` where dir0='2016-01-01'
>
>  Works fine.
>
> However, if I have EXACTLY one subdirectory, then
>
> select count(1) from `data/` where dir0 = '2016-01-01'
>
> Takes 15 seconds (instead of returning almost instantly) and reports 0
> records for count.
> Note, this directory DOES exist, so that is not the issue.
>
> If I add a second directory, then the exact same query returns almost
> instantly and reports the correct number of records.
>
> In addition, when there is only one directory, select count(1) from `data/`
> returns instantly with the correct count.
>
> To me, it appears that if there is ONE and only ONE subdirectory, then dir0=
> doesn't work as I think people would expect it to. I can't think of a real
> reason for it to behave this way, and to me it violates the principle of
> "least surprise", but I am not up on the internals of Drill, so I thought
> I'd post here first.
>
> John
>
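
For anyone trying to reproduce the report above, the queries it describes can be run side by side, once with exactly one subdirectory under data/ and once with two. This is only a sketch: the `USE dfs.tmp` line is an assumption about where data/ lives and is not from the original message, and the outcomes noted in the comments are the ones John reported, not results from a fresh run. The EXPLAIN statement is an extra diagnostic that was not part of the report.

```sql
-- Assumption: `data/` sits under this workspace; adjust to your own setup.
USE dfs.tmp;

-- No filter. Reported to return instantly with the correct count,
-- even when data/ has exactly one subdirectory.
SELECT COUNT(1) FROM `data/`;

-- Filter on the first directory level. Reported to take about 15 seconds
-- and return 0 when data/ has exactly one subdirectory, and to work
-- normally once a second subdirectory exists.
SELECT COUNT(1) FROM `data/` WHERE dir0 = '2016-01-01';

-- Extra diagnostic (not in the original report): compare the plans
-- produced in the one-subdirectory and two-subdirectory cases.
EXPLAIN PLAN FOR
SELECT COUNT(1) FROM `data/` WHERE dir0 = '2016-01-01';
```

Capturing the plan in both cases would make a JIRA report easier to act on, since it shows whether the dir0 filter is being handled differently when only one subdirectory exists.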
