drill-dev mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-19) Build a JSON scanner that does schema discovery
Date Thu, 24 Jan 2013 22:29:14 GMT

    [ https://issues.apache.org/jira/browse/DRILL-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562062#comment-13562062 ]

Jacques Nadeau commented on DRILL-19:

I'd suggest using an embedded schema instead.  It seems more fail-fast, which is what I'm
striving for with this whole heterogeneous data disaster.  We just define a standard "embedded
schema" message type with two fields: bytes and protobufdef.  The repeated field is then of that
message type.  Basically, "leave it for another day": decode if you want.  I think it makes
decoding much simpler to handle, especially if we're not actually interested in the value at
this time.
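
To make the idea concrete, here is a minimal sketch of what such an "embedded schema" wrapper could look like. It is an illustration only, not code from the patch. The names EmbeddedSchemaValue, Record, wrap, and decode are hypothetical, the schema strings are placeholders, and JSON bytes stand in for real protobuf encoding so the example has no dependencies.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class EmbeddedSchemaValue:
    """Hypothetical 'embedded schema' message: opaque payload plus its schema."""
    data: bytes        # encoded value, left undecoded by the scanner
    protobuf_def: str  # placeholder for the schema that describes `data`


@dataclass
class Record:
    """A scanned record whose mixed-type field is a repeated embedded value."""
    name: str
    mixed: List[EmbeddedSchemaValue] = field(default_factory=list)


def wrap(value: Any, schema: str) -> EmbeddedSchemaValue:
    # Assumption: JSON bytes stand in for protobuf encoding in this sketch.
    return EmbeddedSchemaValue(json.dumps(value).encode("utf-8"), schema)


def decode(value: EmbeddedSchemaValue) -> Any:
    # Decoding is opt-in; nothing calls this unless a consumer needs the value.
    return json.loads(value.data.decode("utf-8"))


record = Record("row-1", [
    wrap(42, "placeholder-int-schema"),
    wrap({"a": "b"}, "placeholder-map-schema"),
])

# The values can be counted or forwarded without decoding them.
count = len(record.mixed)

# Decode only the value that is actually needed.
first = decode(record.mixed[0])
```

The point of the design is that the scanner never has to reconcile the differing types: each value carries its own schema, and the cost of decoding is paid only by whoever asks for the value.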

> Build a JSON scanner that does schema discovery
> -----------------------------------------------
>                 Key: DRILL-19
>                 URL: https://issues.apache.org/jira/browse/DRILL-19
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Jacques Nadeau
>            Assignee: Timothy Chen
>         Attachments: scan-json.patch
> Build a JSON scanner that reads a file and converts it into two parts: a stream of records
> and a schema which reflects the schema of the records.

