hive-issues mailing list archives

From "liyunzhang_intel (JIRA)" <>
Subject [jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
Date Sat, 17 Jun 2017 05:33:01 GMT


liyunzhang_intel commented on HIVE-11297:

[~csun]: just one thing needs to be confirmed:
    Operator<?> filterOp = pruningSinkOp;
    Operator<?> selOp = null;
    while (filterOp != null) {
      if (filterOp.getNumChild() > 1) {
      } else {
        selOp = filterOp;
        filterOp = filterOp.getParentOperators().get(0);

Here the original code finds filterOp by traversing backward from pruningSinkOp. Why do we
need to rename filterOp to something else? I think we can remove selOp here because it will
not be used anymore. If my understanding is wrong, please tell me.
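
To make the traversal in question concrete, here is a minimal, self-contained sketch of walking backward from a sink until an operator with more than one child is found. The `Op` class, `childOf`, and `findBranchPoint` are hypothetical stand-ins written for this illustration; they are not Hive's `Operator` API, and the operator names in `main` are made up. The sketch keeps a single cursor, which is roughly what the loop would look like without selOp; whether the real code can drop selOp depends on code not quoted here. It also guards against an empty parent list, which the quoted snippet does not show.

```java
import java.util.ArrayList;
import java.util.List;

class BranchPointSketch {

  /** Hypothetical stand-in for an operator node; NOT Hive's Operator class. */
  public static final class Op {
    public final String name;
    public final List<Op> parents = new ArrayList<>();
    public final List<Op> children = new ArrayList<>();

    public Op(String name) {
      this.name = name;
    }

    /** Wires this operator under the given parent and returns this, for chaining. */
    public Op childOf(Op parent) {
      parents.add(parent);
      parent.children.add(this);
      return this;
    }
  }

  /**
   * Walks upward from start, always following the first parent, and returns
   * the first operator that has more than one child. Returns null if the
   * root is reached without finding such an operator.
   */
  public static Op findBranchPoint(Op start) {
    Op current = start;
    while (current != null) {
      if (current.children.size() > 1) {
        return current;
      }
      // Assumption: stop at the root instead of calling get(0) on an empty list.
      current = current.parents.isEmpty() ? null : current.parents.get(0);
    }
    return null;
  }

  public static void main(String[] args) {
    // Made-up tree: TS -> FIL -> { SEL_MAIN, SEL -> GBY -> SINK }
    Op scan = new Op("TS");
    Op filter = new Op("FIL").childOf(scan);
    new Op("SEL_MAIN").childOf(filter);
    Op select = new Op("SEL").childOf(filter);
    Op groupBy = new Op("GBY").childOf(select);
    Op sink = new Op("SINK").childOf(groupBy);

    System.out.println(findBranchPoint(sink).name); // prints "FIL"
  }
}
```

In this toy tree the walk starts at SINK and stops at FIL, the first ancestor that feeds more than one branch.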

> Combine op trees for partition info generating tasks [Spark branch]
> -------------------------------------------------------------------
>                 Key: HIVE-11297
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: spark-branch
>            Reporter: Chao Sun
>            Assignee: liyunzhang_intel
>         Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, HIVE-11297.3.patch, HIVE-11297.4.patch,
> Currently, for dynamic partition pruning in Spark, if a small table generates partition info for more than one partition column, multiple operator trees are created, which all start from the same table scan op but have different Spark partition pruning sinks.
> As an optimization, we can combine these op trees so that the table scan does not have to run multiple times.

This message was sent by Atlassian JIRA
