hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-20771) LazyBinarySerDe fails on empty structs.
Date Thu, 11 Jun 2020 10:56:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20771?focusedWorklogId=444217&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-444217 ]

ASF GitHub Bot logged work on HIVE-20771:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jun/20 10:55
            Start Date: 11/Jun/20 10:55
    Worklog Time Spent: 10m 
      Work Description: zaboron commented on pull request #450:
URL: https://github.com/apache/hive/pull/450#issuecomment-642569230


   @jcamachor @belugabehr 
   I think this very much makes sense; it is a bug fix. LazyBinarySerDe serializes empty
   structs but then cannot deserialize the same object.
   A serde should always be able to deserialize the objects it serialized.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 444217)
    Time Spent: 0.5h  (was: 20m)

> LazyBinarySerDe fails on empty structs.
> ---------------------------------------
>
>                 Key: HIVE-20771
>                 URL: https://issues.apache.org/jira/browse/HIVE-20771
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.2.2, 2.3.2, 3.1.0
>            Reporter: Clemens Valiente
>            Assignee: Clemens Valiente
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-20771.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:sql}
> CREATE TABLE cvaliente.structtest AS
> SELECT named_struct();
> SHOW CREATE TABLE cvaliente.structtest;
> SELECT * FROM cvaliente.structtest ORDER BY rand();
> {code}
> The resulting schema is:
> {code:sql}
> CREATE TABLE `cvaliente.structtest`(
>   `_c0` struct<>)
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
>   'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://nameservice1/user/cvaliente/cvaliente/structtest2'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true', 
>   'numFiles'='1', 	  
>   'numRows'='1', 
>   'rawDataSize'='0', 
>   'totalSize'='1', 	  
>   'transient_lastDdlTime'='1539781607');
> {code}
> Between the map and reduce phases, Hive serializes the row to a LazyBinaryStruct. When it tries to read the same object back, the {{SELECT}} query above fails:
> {code}
> 2018-10-17 14:32:02,298 [FATAL] [TezChild] |tez.ReduceRecordSource|: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":0.13508293503238622},"value":{"_col0":{}}}
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:338)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:259)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:169)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> 	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating VALUE._col0
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
> 	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:329)
> 	... 17 more
> Caused by: java.lang.RuntimeException: length should be positive!
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryNonPrimitive.init(LazyBinaryNonPrimitive.java:54)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.init(LazyBinaryStruct.java:95)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
> 	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
> 	at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
> 	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
> 	... 18 more
> {code}
> This is because LazyBinaryNonPrimitive does not allow empty structs: its init method throws "length should be positive!" for them, as the trace above shows. See https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryNonPrimitive.java#L53
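
The failure mode described above can be illustrated with a small, self-contained Java sketch. It is not Hive code: the class EmptyStructRoundTrip, its methods, and the 4-bytes-per-int encoding are all hypothetical stand-ins. The only detail taken from the report is the error message in the stack trace; the sketch assumes the guard rejects any non-positive length, and the lenient variant is just one possible direction for a fix, not necessarily what the attached patch does.

```java
import java.nio.ByteBuffer;

public class EmptyStructRoundTrip {

    // Toy encoding (an assumption, not Hive's wire format): 4 bytes per int field.
    // A struct with no fields therefore serializes to zero bytes.
    static byte[] serialize(int[] fields) {
        ByteBuffer buf = ByteBuffer.allocate(4 * fields.length);
        for (int f : fields) {
            buf.putInt(f);
        }
        return buf.array();
    }

    // Models the guard implied by the stack trace: a non-positive length is
    // rejected before any decoding happens.
    static int[] deserializeStrict(byte[] bytes) {
        if (bytes.length <= 0) {
            throw new RuntimeException("length should be positive!");
        }
        return decode(bytes);
    }

    // Hypothetical lenient variant: a zero-length payload is treated as a
    // valid empty struct.
    static int[] deserializeLenient(byte[] bytes) {
        return decode(bytes);
    }

    // Reads back 4-byte ints until the payload is exhausted.
    static int[] decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int[] fields = new int[bytes.length / 4];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = buf.getInt();
        }
        return fields;
    }

    public static void main(String[] args) {
        byte[] empty = serialize(new int[0]);
        System.out.println("serialized length: " + empty.length);
        try {
            deserializeStrict(empty);
        } catch (RuntimeException e) {
            System.out.println("strict: " + e.getMessage());
        }
        System.out.println("lenient fields: " + deserializeLenient(empty).length);
    }
}
```

In this model the writer happily produces a zero-length payload, so the strict reader breaks the round-trip property that a serde should be able to read back whatever it wrote. Non-empty structs are unaffected in both variants.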



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
