From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-18148) NPE in SparkDynamicPartitionPruningResolver
Date Wed, 20 Dec 2017 07:54:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297994#comment-16297994 ]

Rui Li commented on HIVE-18148:
-------------------------------

If we want to base the decision on statistics, both the target table size and the DPP sink
output size (a smaller output means more partitions are pruned) should be taken into account.
Besides, we also need to consider the cost of re-computing, as I mentioned above. Let's leave
that as a follow-up.
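
For illustration only, here is a rough sketch of how such a statistics-based check could be
shaped, weighing the expected savings on the target table against the cost of re-computing the
pruning branch. Nothing below is existing Hive code; all names and thresholds are made up.

{noformat}
// Hypothetical sketch of a statistics-based decision for keeping a DPP sink.
// None of these names come from Hive; the thresholds are illustrative only.
public class DppHeuristicSketch {

  /**
   * Decide whether dynamic partition pruning is likely worthwhile.
   *
   * @param targetTableSize   total size of the partitioned target table, in bytes
   * @param dppSinkOutputSize estimated size of the DPP sink output, in bytes
   *                          (a smaller output usually means more partitions get pruned)
   * @param recomputeCost     estimated cost of re-computing the pruning branch, in bytes scanned
   */
  static boolean isDppWorthwhile(long targetTableSize,
                                 long dppSinkOutputSize,
                                 long recomputeCost) {
    // Crude proxy for the fraction of the target table we expect to skip;
    // a real heuristic would use partition-level column statistics.
    double prunedFraction = 1.0 - Math.min(1.0, (double) dppSinkOutputSize / targetTableSize);
    long expectedSavings = (long) (prunedFraction * targetTableSize);

    // Only prune if the expected savings clearly outweigh the re-computation cost.
    return expectedSavings > 2 * recomputeCost;
  }

  public static void main(String[] args) {
    // 100 GB target table, 1 MB sink output, 5 GB recompute cost -> worthwhile
    System.out.println(isDppWorthwhile(100L << 30, 1L << 20, 5L << 30));
    // 10 GB target table, 8 GB sink output, 5 GB recompute cost -> not worthwhile
    System.out.println(isDppWorthwhile(10L << 30, 8L << 30, 5L << 30));
  }
}
{noformat}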

> NPE in SparkDynamicPartitionPruningResolver
> -------------------------------------------
>
>                 Key: HIVE-18148
>                 URL: https://issues.apache.org/jira/browse/HIVE-18148
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-18148.1.patch, HIVE-18148.2.patch
>
>
> The stack trace is:
> {noformat}
> 2017-11-27T10:32:38,752 ERROR [e6c8aab5-ddd2-461d-b185-a7597c3e7519 main] ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver$SparkDynamicPartitionPruningDispatcher.dispatch(SparkDynamicPartitionPruningResolver.java:100)
>         at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>         at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>         at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>         at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver.resolve(SparkDynamicPartitionPruningResolver.java:74)
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeTaskPlan(SparkCompiler.java:568)
> {noformat}
> At this stage, there shouldn't be a DPP sink whose target map work is null. The root cause seems to be a malformed operator tree generated by SplitOpTreeForDPP.
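
As a side note, here is a minimal, self-contained sketch of the kind of defensive check that
avoids the NPE at the dispatch site. The types below are placeholders, not the actual Hive
classes, and this is not the attached patch.

{noformat}
// Sketch only: skip DPP sinks whose target map work cannot be resolved
// instead of dereferencing null during dispatch.
import java.util.ArrayList;
import java.util.List;

public class DppNullTargetSketch {

  // Placeholder stand-ins for Hive's MapWork / pruning-sink structures.
  static class MapWork { final String name; MapWork(String n) { name = n; } }
  static class DppSink { final MapWork targetWork; DppSink(MapWork w) { targetWork = w; } }

  static List<DppSink> keepValidSinks(List<DppSink> sinks) {
    List<DppSink> valid = new ArrayList<>();
    for (DppSink sink : sinks) {
      if (sink.targetWork == null) {
        // Malformed operator tree: drop the pruning branch (or fail fast with
        // a clear message) rather than hit an NPE later during dispatch.
        continue;
      }
      valid.add(sink);
    }
    return valid;
  }

  public static void main(String[] args) {
    List<DppSink> sinks = new ArrayList<>();
    sinks.add(new DppSink(new MapWork("Map 1")));
    sinks.add(new DppSink(null));
    System.out.println(keepValidSinks(sinks).size()); // prints 1
  }
}
{noformat}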



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
