drill-user mailing list archives

From Prabhakar Bhosaale <bhosale....@gmail.com>
Subject Re: Json Schema change error
Date Sun, 26 Sep 2021 12:49:04 GMT
Hi Luoc,
Thanks for the input, but unfortunately I am not a Java developer, and
understanding the logic by reading the code would be difficult for me. Is
there any documentation available on this topic?
Thanks in advance.

Regards
Prabhakar

On Sat, Sep 25, 2021 at 4:21 PM luoc <luoc@apache.org> wrote:

> Hello Prabhakar,
>
>   Good questions. I think you want to understand the internal logic of
> the JSON loader.
> Drill actually includes two JSON loaders: an older version and a newer
> revision.
> I suggest debugging against the following code:
>
> 1. Unit tests for the new JSON loader:
>
> /drill-java-exec/src/test/java/org/apache/drill/exec/store/easy/json/loader
>
> 2. New JSON loader in HTTP storage :
>
> https://github.com/apache/drill/blob/bf2b0d79e43bf65448557510a7b39f17c428df78/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpBatchReader.java#L103
>
> 3. JSON Record Reader :
>         org.apache.drill.exec.store.json.TestJsonRecordReader
>
>
> > On Sep 24, 2021, at 7:36 PM, Prabhakar Bhosaale <bhosale.p.v@gmail.com> wrote:
> >
> > Hi Team,
> > I am getting the following error while querying a JSON file:
> >
> > "(java.lang.Exception) UNSUPPORTED_OPERATION ERROR: Schema changes not
> > supported in External Sort."
> >
> > I have identified the root cause: one of the columns has a NULL value in
> > some rows and a STRING value in others.
> >
> > I am trying to find out how Drill decides the datatype of columns and
> > identifies schema changes.
> >
> > I tried the following changes to the data and got different results.
> >
> > Case 1 - If I remove the ORDER BY clause from the query, I don't get the
> > error. Note that this specific column is not part of the ORDER BY clause,
> > but it is part of the select list.
> >
> > Case 2 - If I keep only two rows in the file, one with NULL and the other
> > with a STRING value for the given column, there is no error. The query
> > returns the data successfully.
> >
> > Case 3 - If I change the value of the given column in the first row from
> > NULL to an empty string (two double quotes), there is no error.
> >
> > I had previously read somewhere that Drill reads a certain number of
> > initial rows of the JSON and decides the datatype from them, but I am not
> > able to find that in the documentation now.
> >
> > Thanks and Regards
> > Prabhakar
>
>
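
The three cases described in the original question can be reproduced with a toy model. The sketch below is not Drill's code. It assumes a reader that infers a column's type one batch at a time and falls back to a placeholder type when a batch contains only nulls. The batch size, the placeholder name, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

# Hypothetical batch size; real readers use far larger batches.
BATCH_SIZE = 3

# Hypothetical placeholder type assigned when a batch holds only nulls.
PLACEHOLDER_TYPE = "NULLABLE_PLACEHOLDER"


def infer_type(values: List[Any]) -> str:
    """Infer a column type from the values of a single batch."""
    for value in values:
        if isinstance(value, str):
            return "VARCHAR"
    # No string seen in this batch, so the type is only a guess.
    return PLACEHOLDER_TYPE


def batch_types(rows: List[Dict[str, Any]], column: str,
                batch_size: int = BATCH_SIZE) -> List[str]:
    """Return the type inferred for `column` in each consecutive batch."""
    types = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        types.append(infer_type([row.get(column) for row in batch]))
    return types


def has_schema_change(types: List[str]) -> bool:
    """A schema change occurs when batches disagree on the column type."""
    return len(set(types)) > 1


if __name__ == "__main__":
    # Nulls fill the first batch, strings arrive in a later batch.
    many_rows = [{"c": None}] * 3 + [{"c": "abc"}] * 3
    # Case 2: two rows that fit into a single batch.
    two_rows = [{"c": None}, {"c": "abc"}]
    # Case 3: the first row holds an empty string instead of null.
    empty_first = [{"c": ""}] + [{"c": None}] * 2 + [{"c": "abc"}] * 3

    for name, rows in [("many rows", many_rows),
                       ("two rows", two_rows),
                       ("empty string first", empty_first)]:
        types = batch_types(rows, "c")
        print(name, types, "schema change:", has_schema_change(types))
```

Under these assumptions, a column is flagged only when one batch sees nothing but nulls and a later batch sees strings. An operator that needs a single schema across all batches, such as a sort, would then fail, while a plain scan that passes batches through would not.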
