drill-dev mailing list archives

From "Igor Guzenko (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-6181) CTAS should support writing nested structures (nested lists) to parquet.
Date Sun, 01 Sep 2019 09:47:00 GMT

     [ https://issues.apache.org/jira/browse/DRILL-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Guzenko resolved DRILL-6181.
---------------------------------
    Resolution: Fixed

Done within the scope of DRILL-7326.



> CTAS should support writing nested structures (nested lists) to parquet.
> ------------------------------------------------------------------------
>
>                 Key: DRILL-6181
>                 URL: https://issues.apache.org/jira/browse/DRILL-6181
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.12.0
>            Reporter: Khurram Faraaz
>            Priority: Major
>
> Both Parquet and Hive support writing nested structures into Parquet:
> https://issues.apache.org/jira/browse/HIVE-8909
> https://issues.apache.org/jira/browse/PARQUET-113
> A CTAS from Drill fails when one of the columns in the projection is a nested list of lists.
> JSON data used in the test; note that "arr" is a nested list of lists:
>  
> {noformat} 
> [root@qa102-45 ~]# cat jsonToParquet_02.json
> {"id":"123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"3","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"13","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"12","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"2","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"1","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"230","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"1230","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"1123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"2123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> {"id":"1523","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
> [root@qa102-45 ~]#
> {noformat}
> CTAS fails with UnsupportedOperationException on Drill 1.12.0-mapr, commit id bb07ebbb9ba8742f44689f8bd8efb5853c5edea0:
> {noformat}
>  0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prq_from_json_02 as select id, arr from `jsonToParquet_02.json`;
> Error: SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST
> Fragment 0:0
> [Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010] (state=,code=0)
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-02-22 09:56:54,368 [2570fb99-62da-a516-2c1f-0381e21723ae:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST
> Fragment 0:0
> [Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST
> Fragment 0:0
> [Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010]
>  at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-mapr.jar:1.12.0-mapr]
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.UnsupportedOperationException: Unsupported type LIST
>  at org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType(ParquetRecordWriter.java:253) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.store.parquet.ParquetRecordWriter.newSchema(ParquetRecordWriter.java:205) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.store.parquet.ParquetRecordWriter.updateSchema(ParquetRecordWriter.java:190) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema(WriterRecordBatch.java:157) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:103) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:79) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_161]
>  at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_161]
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) ~[hadoop-common-2.7.0-mapr-1707.jar:na]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
>  ... 4 common frames omitted
> {noformat}
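
For anyone re-testing the fix, the quoted test file can be regenerated rather than copied by hand. What follows is only a minimal sketch, assuming Python 3; the helper names `build_lines` and `write_file` are hypothetical and not part of the original report, and the output path simply reuses the file name from the quoted session.

```python
import json

# IDs exactly as they appear in the quoted jsonToParquet_02.json, in order.
IDS = ["123", "3", "13", "12", "2", "1", "230", "1230", "1123", "2123", "1523"]

# The nested list of lists stored in the "arr" column of every record.
ARR = [[1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]


def build_lines(ids=IDS, arr=ARR):
    """Return one compact JSON document per record (hypothetical helper)."""
    # separators=(",", ":") drops the default spaces so the output matches
    # the compact formatting of the quoted file.
    return [json.dumps({"id": i, "arr": arr}, separators=(",", ":")) for i in ids]


def write_file(path="jsonToParquet_02.json"):
    """Write the records as newline-delimited JSON, one record per line."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(build_lines()) + "\n")


if __name__ == "__main__":
    write_file()
```

Placing the generated file in the workspace used in the report (dfs.tmp there) and rerunning the quoted CTAS statement should show whether a given build still raises "Unsupported type LIST".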



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
