hive-issues mailing list archives

From "Aniket Mokashi (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-17448) ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
Date Mon, 08 Oct 2018 22:50:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi reassigned HIVE-17448:
-------------------------------------

    Assignee: Aniket Mokashi

> ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
> ------------------------------------------------------------------------
>
>                 Key: HIVE-17448
>                 URL: https://issues.apache.org/jira/browse/HIVE-17448
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.1
>         Environment: Reproduced on Dataproc 1.1, 1.2 (Hive 2.1).
>            Reporter: Nikolay Sokolov
>            Assignee: Aniket Mokashi
>            Priority: Minor
>         Attachments: HIVE-17448.1-branch-2.1.patch
>
>
> When ORC files were created with an older schema that had fewer struct fields, and the table schema is later changed to one with more struct fields, an ArrayIndexOutOfBoundsException is thrown if the struct column is followed by sibling columns.
> Steps to reproduce:
> {code:none}
> create external table test_broken_struct(a struct<f1:int, f2:int>, b int) stored as orc;
> insert into table test_broken_struct 
>     select named_struct("f1", 1, "f2", 2), 3;
> drop table test_broken_struct;
> create external table test_broken_struct(a struct<f1:int, f2:int, f3:int>, b int) stored as orc;
> select * from test_broken_struct;
> {code}
> The same scenario does not cause a crash on Hive 0.14.
> Debug log and stack trace:
> {code:none}
> 2017-09-07T00:21:40,266  INFO [main] orc.OrcInputFormat: Using schema evolution configuration variables schema.evolution.columns [a, b] / schema.evolution.columns.types [struct<f1:int,f2:int,f3:int>, int] (isAcidRead false)
> 2017-09-07T00:21:40,267 DEBUG [main] orc.OrcInputFormat: No ORC pushdown predicate
> 2017-09-07T00:21:40,267  INFO [main] orc.ReaderImpl: Reading ORC rows from hdfs://cluster-7199-m/user/hive/warehouse/test_broken_struct/000000_0 with {include: [true, true, true, true, true], offset: 3, length: 159, schema: struct<a:struct<f1:int,f2:int,f3:int>,b:int>}
> Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> 2017-09-07T00:21:40,273 ERROR [main] CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:195)
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:253)
>         at org.apache.orc.impl.SchemaEvolution.<init>(SchemaEvolution.java:59)
>         at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:149)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63)
>         at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:225)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
>         ... 15 more
> {code}
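
One possible reading of the log and trace above: ORC numbers the columns of a schema in pre-order, so the root struct is 0 and each nested field gets the next id. The file written with the old schema then has five columns, which is consistent with the five-entry include array in the log, while `b` in the new reader schema gets id 5. The sketch below only models that numbering to show where an index of 5 could come from. It is not the SchemaEvolution code, and the dict-based schema representation and the `assign_ids` helper are hypothetical names introduced for illustration.

```python
# Illustration only: a tiny model of ORC's pre-order column numbering.
# The dict-based schemas and assign_ids() are hypothetical, not ORC APIs.

def assign_ids(schema, name="<root>", ids=None):
    """Return a list of (column_id, dotted_name) pairs in pre-order."""
    if ids is None:
        ids = []
    ids.append((len(ids), name))
    if isinstance(schema, dict):  # a struct: visit its fields in declared order
        for field, child in schema.items():
            child_name = field if name == "<root>" else f"{name}.{field}"
            assign_ids(child, child_name, ids)
    return ids

# Schema the ORC file was written with (first CREATE TABLE).
file_schema = {"a": {"f1": "int", "f2": "int"}, "b": "int"}
# Schema the reader is given (second CREATE TABLE, with the extra field f3).
reader_schema = {"a": {"f1": "int", "f2": "int", "f3": "int"}, "b": "int"}

file_ids = assign_ids(file_schema)
reader_ids = assign_ids(reader_schema)

# An array sized by the file's column count, like the include array in the log.
include = [True] * len(file_ids)

# Column id of `b` according to the reader schema.
reader_b_id = {col: col_id for col_id, col in reader_ids}["b"]

for col_id, col in reader_ids:
    status = "ok" if col_id < len(include) else "out of bounds"
    print(col_id, col, status)
```

Under these assumptions, adding `f3` shifts the id of every column after the struct by one, so the reader-side id of `b` no longer fits in an array sized for the file's columns. That would also match the condition in the report that the failure needs a sibling column after the struct.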



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
