drill-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [drill] cgivre commented on pull request #2283: DRILL-7979: Self-Closing XML Tags Cause Schema Change Exceptions
Date Mon, 02 Aug 2021 22:31:24 GMT

cgivre commented on pull request #2283:
URL: https://github.com/apache/drill/pull/2283#issuecomment-891377214


   > Yes, I think I did grok the unknown schema problem. The thought above, which somehow
escaped all the striking out I did to it after thinking a bit more, was to take advantage
of the fact that a scalar string can be embedded into a single-element map. The tuple-generating
code would need to become aware of when it should do this.
   > 
   > My second comment's comparison of the situation with a JSON property that is first
null, then an object, is also a bit dubious because empty XML elements do not represent nulls
(from what I read) so much as zero-length strings.
   > 
   > If there is an effort to make querying XML behave in a more similar way to querying
equivalent JSON, for some definition of equivalent, it should probably wait for another PR.
   
   I think you're right about that.  From what I remember, there is an option for Drill's
JSON parser to treat `NaN` and something else as `null`.   For XML I don't know how you'd
distinguish between an empty string and `null`.  
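
To illustrate why that distinction is hard to draw, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not Drill's XML reader, and the element names are made up for illustration. It only shows that, in this parser, a self-closing tag and an empty open/close pair are reported the same way.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one self-closing tag, one empty open/close pair,
# and one element that carries text.
doc = "<row><a/><b></b><c>text</c></row>"
row = ET.fromstring(doc)

def text_of(tag):
    """Return the raw text ElementTree reports for a child of <row>."""
    return row.find(tag).text

# ElementTree reports no text for both empty forms, so the parser output
# alone cannot say whether the author meant "" or a missing value.
print(text_of("a") is None, text_of("b") is None, text_of("c"))
```

Any choice between "empty string" and "null" for such elements is therefore a convention the reader has to impose, not something the markup itself states.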
   
   This was also an issue with some data I was working on.  The JSON version used empty strings
to denote `null`, and subsequent rows then contained maps, which caused SchemaChange exceptions.
 The only way to fix that was to use the `UNION` data type.
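
The kind of data described above can be sketched as follows. This is plain Python rather than Drill code, and the field names and values are hypothetical. It just finds fields whose value type changes between rows, which is the situation a one-type-per-column reader cannot represent and a union-typed column can.

```python
from collections import defaultdict

def column_types(rows):
    """Map each field name to the set of Python type names seen for it."""
    seen = defaultdict(set)
    for row in rows:
        for name, value in row.items():
            seen[name].add(type(value).__name__)
    return dict(seen)

def conflicting_columns(rows):
    """Return the fields with more than one observed type, sorted by name."""
    return sorted(n for n, t in column_types(rows).items() if len(t) > 1)

# Hypothetical records: "address" is an empty string in the first row
# (standing in for null) and a map in the second.
rows = [
    {"id": 1, "address": ""},
    {"id": 2, "address": {"city": "Baltimore"}},
]
print(conflicting_columns(rows))
```

Here `address` is flagged because it is a string in one row and a map in the next, while `id` keeps a single type throughout.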


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


