drill-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [drill] paul-rogers opened a new pull request #1690: DRILL-7086: Output schema for row set mechanism
Date Tue, 12 Mar 2019 05:22:10 GMT
paul-rogers opened a new pull request #1690: DRILL-7086: Output schema for row set mechanism
URL: https://github.com/apache/drill/pull/1690
 
 
   Enhances the row set mechanism to take an "output schema" that describes the vectors to
create. The "input schema" describes the type that the reader would like to write. When the
two differ, a conversion shim is inserted to convert from the input type to the output type.
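
   As a rough illustration of the shim idea, the sketch below wraps an int writer in a
string writer, so a reader that produces text can feed an INT column. All interface and class
names are hypothetical stand-ins chosen for this example. They are not the classes in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// All names here are hypothetical; they are not Drill's actual classes.

/** What the reader wants to write: the "input" type. */
interface StringColumnWriter {
  void setString(String value);
}

/** What the vector stores: the "output" type. */
interface IntColumnWriter {
  void setInt(int value);
}

/** Stand-in for an INT vector; a plain list keeps the sketch self-contained. */
final class IntVectorStandIn implements IntColumnWriter {
  final List<Integer> values = new ArrayList<>();

  @Override
  public void setInt(int value) {
    values.add(value);
  }
}

/** The shim: accepts strings from the reader and forwards ints to the vector. */
final class StringToIntShim implements StringColumnWriter {
  private final IntColumnWriter target;

  StringToIntShim(IntColumnWriter target) {
    this.target = target;
  }

  @Override
  public void setString(String value) {
    // NumberFormatException propagates if the text is not a valid int.
    target.setInt(Integer.parseInt(value.trim()));
  }
}
```

   The reader only ever sees the `StringColumnWriter` side, so it does not need to know which
vector type was chosen.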
   
   The "output schema" will be the one provided by the new schema mechanism: a later PR will
connect the mechanism here with the "output schema" provided in the physical plan.
   
   The output schema is optional. Within an output schema, the columns are optional. If no
output schema is provided, or no column matches a given input column, then the input schema
determines the vector type, as was the case before this enhancement.
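
   The fallback rule can be sketched as a small lookup. The enum, the class, and the use of a
plain map for the output schema are assumptions made for illustration only.

```java
import java.util.Map;

// Hypothetical names; this sketches only the fallback rule described above.
enum SketchType { INT, VARCHAR, DATE }

final class VectorTypeResolver {
  /**
   * Returns the type of the vector to create for one column.
   * A null outputSchema models "no output schema provided".
   */
  static SketchType resolve(String column, SketchType inputType,
                            Map<String, SketchType> outputSchema) {
    if (outputSchema == null) {
      return inputType;               // no output schema at all
    }
    SketchType outputType = outputSchema.get(column);
    // No matching output column: the input schema decides, as before.
    return outputType != null ? outputType : inputType;
  }
}
```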
   
   This PR includes several "starter" conversion classes. These are mostly prototypes; only
the string-to-int version has been fully tested. The date, time and date-time versions show
the use of the new format property added to the column metadata class.
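
   To illustrate how a per-column format property might drive a string-to-date conversion,
here is a minimal sketch. The metadata class, its field names, and the use of `java.time`
pattern syntax are assumptions for this example and may not match what the PR does.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for column metadata that carries a format property.
final class DateColumnMeta {
  final String name;
  final String format;   // may be null when no format was declared

  DateColumnMeta(String name, String format) {
    this.name = name;
    this.format = format;
  }
}

/** Parses text into a date using the column's format, or ISO dates by default. */
final class StringToDateSketch {
  private final DateTimeFormatter formatter;

  StringToDateSketch(DateColumnMeta meta) {
    this.formatter = meta.format == null
        ? DateTimeFormatter.ISO_LOCAL_DATE
        : DateTimeFormatter.ofPattern(meta.format);
  }

  LocalDate convert(String text) {
    // DateTimeParseException propagates if the text does not match the format.
    return LocalDate.parse(text, formatter);
  }
}
```

   With a format of `MM/dd/yyyy`, the text `03/12/2019` parses to 12 March 2019. With no
format, the same date would have to be written as `2019-03-12`.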
   
   A format or storage plugin can specify its own conversion rules. For example, the plugin
for HBase might provide byte-array-to-whatever converters.
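
   One way to picture plugin-specific rules is a rule table that is consulted first and
falls back to a shared default table. The class below is a hypothetical sketch: the string
keys, the `Function`-based converters, and the big-endian 4-byte int encoding are all
assumptions, not the PR's API or any plugin's actual encoding.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical rule table; keys are written "FROM->TO".
final class ConversionRules {
  private final Map<String, Function<Object, Object>> rules = new HashMap<>();
  private final ConversionRules fallback;   // null for the default table

  ConversionRules(ConversionRules fallback) {
    this.fallback = fallback;
  }

  ConversionRules add(String from, String to, Function<Object, Object> fn) {
    rules.put(from + "->" + to, fn);
    return this;
  }

  /** Returns the converter to use, or null if neither table has a rule. */
  Function<Object, Object> find(String from, String to) {
    Function<Object, Object> fn = rules.get(from + "->" + to);
    if (fn == null && fallback != null) {
      return fallback.find(from, to);
    }
    return fn;
  }

  /** Shared defaults available to every reader. */
  static ConversionRules defaults() {
    return new ConversionRules(null)
        .add("VARCHAR", "INT", s -> Integer.parseInt((String) s));
  }

  /** Rules a byte-oriented plugin might layer on top of the defaults. */
  static ConversionRules binaryPluginRules(ConversionRules defaults) {
    return new ConversionRules(defaults)
        .add("VARBINARY", "INT", b -> ByteBuffer.wrap((byte[]) b).getInt());
  }
}
```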
   
   An odd aspect of the current implementation is that the type conversion is set up at reader
open time. If the reader detects a column type that cannot be converted to the desired output
type using known rules, the query will fail. This is odd because one might expect this
error to be caught at plan time. But Drill is, of course, schema-on-read, so read time is
when we'd detect the conflict. This may not be a problem with real tables.
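
   The open-time failure can be sketched as a check that runs once the reader has discovered
its column types. Everything below is hypothetical: the type names as strings, the fixed set
of known conversions, and the exception type are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical open-time check; conversions are written "FROM->TO".
final class ReaderOpenCheck {
  private static final Set<String> KNOWN_CONVERSIONS =
      new HashSet<>(Arrays.asList("VARCHAR->INT", "VARCHAR->DATE"));

  /**
   * discovered: column name to the type the reader found in the data.
   * outputSchema: column name to the requested vector type (columns optional).
   */
  static void open(Map<String, String> discovered, Map<String, String> outputSchema) {
    for (Map.Entry<String, String> col : discovered.entrySet()) {
      String wanted = outputSchema.get(col.getKey());
      if (wanted == null || wanted.equals(col.getValue())) {
        continue;   // no conversion requested or needed
      }
      if (!KNOWN_CONVERSIONS.contains(col.getValue() + "->" + wanted)) {
        // Only detectable now: the plan never saw the file's actual types.
        throw new IllegalStateException("Cannot convert column " + col.getKey()
            + " from " + col.getValue() + " to " + wanted);
      }
    }
  }
}
```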

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
