hive-issues mailing list archives

From "Jesus Camacho Rodriguez (JIRA)" <>
Subject [jira] [Commented] (HIVE-11652) Avoid expensive call to removeAll in DefaultGraphWalker
Date Wed, 21 Dec 2016 14:51:58 GMT


Jesus Camacho Rodriguez commented on HIVE-11652:

[~dhiraj.kumar], thanks for creating the issue.

I meant keeping a position per node, not just a single position for the last node. That is
why it would add memory pressure.

> In fact that finding position itself is not a problem, but ASTNode.getChildren() invocation
> is problem.
Yes, that was my guess when I checked the code, and we should probably focus on this part.
Each time we call that method, a new list is created and the node's children are copied into
it. I do not know the original reason why we overrode _getChildren_ to do that on every call,
or whether there is actually a good reason for it. I would expect that by default we return
the original list, and return a copy only when it is explicitly requested (via a boolean).
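
A minimal sketch of the behaviour suggested above, assuming a simplified node type. The class `ChildListSketch`, its nested `Node`, and the `getChildren(boolean copy)` overload are hypothetical names used only for illustration; this is not Hive's ASTNode code.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildListSketch {
    /** Hypothetical stand-in for a tree node; NOT Hive's ASTNode. */
    public static class Node {
        private final List<Node> children = new ArrayList<>();

        public void addChild(Node child) {
            children.add(child);
        }

        /** Default: hand back the backing list itself, so nothing is copied per call. */
        public List<Node> getChildren() {
            return getChildren(false);
        }

        /** Copy only when the caller explicitly asks for one (assumed overload). */
        public List<Node> getChildren(boolean copy) {
            return copy ? new ArrayList<>(children) : children;
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.addChild(new Node());
        root.addChild(new Node());

        // Same list instance on repeated calls: no per-call copying.
        System.out.println(root.getChildren() == root.getChildren());
        // An explicit copy is a distinct list with the same contents.
        System.out.println(root.getChildren(true) != root.getChildren());
    }
}
```

The trade-off is that callers of the default method share the node's internal list, so a caller that mutates it changes the tree; the explicit copy is the safe choice for such callers.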

In any case, we can continue the discussion in HIVE-15486.

> Avoid expensive call to removeAll in DefaultGraphWalker
> -------------------------------------------------------
>                 Key: HIVE-11652
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer, Physical Optimizer
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 1.3.0, 2.0.0
>         Attachments: HIVE-11652.01.patch, HIVE-11652.02.patch, HIVE-11652.patch
> When the plan is too large, the removeAll call in DefaultGraphWalker (line 140) will
> take very long as it will have to go through the list looking for each of the nodes. We try
> to get rid of this call by rewriting the logic in the walker.
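
To illustrate the cost described in the issue summary, here is a rough sketch of why `removeAll` with a list argument gets slow on large inputs, together with one generic way to avoid the repeated scans. It is not the change made in the HIVE-11652 patches; `RemoveAllSketch` and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveAllSketch {
    /**
     * ArrayList.removeAll(c) calls c.contains(x) for every element x of the list.
     * When c is itself a list, each contains() is a linear scan, so the total
     * work grows with pending.size() * processed.size().
     */
    public static <T> List<T> pendingViaRemoveAll(List<T> pending, List<T> processed) {
        List<T> result = new ArrayList<>(pending);
        result.removeAll(processed);
        return result;
    }

    /**
     * Same result, but membership is checked against a HashSet, so each lookup
     * is expected constant time (assumes T has consistent equals/hashCode).
     */
    public static <T> List<T> pendingViaSet(List<T> pending, List<T> processed) {
        Set<T> seen = new HashSet<>(processed);
        List<T> result = new ArrayList<>();
        for (T node : pending) {
            if (!seen.contains(node)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> pending = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> processed = Arrays.asList(2, 4);
        System.out.println(pendingViaRemoveAll(pending, processed));
        System.out.println(pendingViaSet(pending, processed));
    }
}
```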

