drill-user mailing list archives

From Daniel Barclay <dbarc...@maprtech.com>
Subject default to waiting for net/full schema? [was: Re: Desired Behavior when a table has both files and folders?]
Date Fri, 01 May 2015 01:08:21 GMT
Should Drill default to not sending changing schema information (that is,
to waiting until it has all the schema information before returning any
through JDBC), and only send changing schemas when the client has somehow
told Drill that it can handle changing schemas (e.g., when the client
registers a handler for schema changes, or in some connection property)?

Then Drill would work "normally" in regular JDBC tools (they won't fail
to show columns that didn't exist in earlier rows or--worse--crash trying
to access columns that no longer exist in later rows), but Drill could
still incrementally return changing schema information to clients that
can handle it.
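
The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Drill or sqlline code: the function names (`merged_columns`, `first_batch_view`, `full_schema_view`) are hypothetical, each batch is modeled as a list of dicts, and the sample batches are modeled on Rahul's two JSON files, in the order his result implies.

```python
from typing import Dict, Iterable, List, Optional

Row = Dict[str, Optional[object]]
Batch = List[Row]


def merged_columns(batches: Iterable[Batch]) -> List[str]:
    """Return every column name seen in the batches, in first-seen order."""
    columns: List[str] = []
    for batch in batches:
        for row in batch:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return columns


def first_batch_view(batches: Iterable[Batch]) -> List[Row]:
    """Model a client that freezes its column list after the first batch."""
    columns: Optional[List[str]] = None
    rows: List[Row] = []
    for batch in batches:
        if columns is None:
            columns = merged_columns([batch])  # schema fixed here
        # Later columns are dropped; missing ones show up as None.
        rows.extend({c: row.get(c) for c in columns} for row in batch)
    return rows


def full_schema_view(batches: Iterable[Batch]) -> List[Row]:
    """Model the proposed default: buffer everything, then emit one schema."""
    buffered = list(batches)  # wait until all batches have arrived
    columns = merged_columns(buffered)
    return [{c: row.get(c) for c in columns} for batch in buffered for row in batch]


# Hypothetical batches modeled on file2.json (under folder1) and file1.json.
batches = [
    [{"dir0": "folder1", "col2": 2}],
    [{"col1": 1}],
]

for row in first_batch_view(batches):
    print(row)
for row in full_schema_view(batches):
    print(row)
```

In this model the first view loses `col1` entirely, while the second pays for a complete schema by holding all rows until the last batch arrives, which is the trade-off a client would accept or opt out of.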

Daniel


Steven Phillips wrote:
> I believe the missing columns are due to a limitation in sqlline itself. For
> this query, Drill doesn't know in advance what columns will be returned. It
> just returns them as they come. When the first batch gets back to sqlline,
> it will assume that whatever columns it receives in that batch are the only
> columns this query will return. And it ignores any new columns that show up.
>
> On Wed, Apr 29, 2015 at 6:20 PM, Hao Zhu <hzhu@maprtech.com> wrote:
>
>> You can specify the column names.
>> "select *" explores the schema by itself.
>>
>>> select * from `data`;
>> +------------+------------+
>> |    dir0    |    col1    |
>> +------------+------------+
>> | null       | 1          |
>> | folder1    | null       |
>> | folder1    | null       |
>> | folder1    | 4          |
>> +------------+------------+
>> 4 rows selected (0.074 seconds)
>>> select dir0,col1,col2 from `data`;
>> +------------+------------+------------+
>> |    dir0    |    col1    |    col2    |
>> +------------+------------+------------+
>> | null       | 1          | null       |
>> | folder1    | null       | 3          |
>> | folder1    | null       | 2          |
>> | folder1    | 4          | null       |
>> +------------+------------+------------+
>> 4 rows selected (0.088 seconds)
>>> select dir0,col1,col2,col3 from `data`;
>> +------------+------------+------------+------------+
>> |    dir0    |    col1    |    col2    |    col3    |
>> +------------+------------+------------+------------+
>> | null       | 1          | null       | null       |
>> | folder1    | null       | 3          | null       |
>> | folder1    | null       | 2          | null       |
>> | folder1    | 4          | null       | null       |
>> +------------+------------+------------+------------+
>> 4 rows selected (0.098 seconds)
>>
>> Thanks,
>> Hao
>>
>> On Wed, Apr 29, 2015 at 5:14 PM, rahul challapalli <
>> challapallirahul@gmail.com> wrote:
>>
>>> What is the desired behavior when I run "select * from data;" on the
>>> below structure?
>>>
>>> data/
>>>    -- file1.json
>>>    -- folder1/
>>>         -- file2.json
>>>
>>> file1.json : {"col1" : 1}
>>> file2.json : {"col2" : 2}
>>>
>>> This is what drill returns :
>>> +------------+------------+
>>> |    dir0    |    col2    |
>>> +------------+------------+
>>> | folder1    | 2          |
>>> | null       | null       |
>>> +------------+------------+
>>>
>>> Looks like drill ignored the columns from the first file.
>>>
>>> - Rahul
>>>
>>
>
>
>


-- 
Daniel Barclay
MapR Technologies
