hive-issues mailing list archives

From "Dapeng Sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation
Date Wed, 08 Feb 2017 05:52:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857457#comment-15857457 ]

Dapeng Sun commented on HIVE-15682:
-----------------------------------

Hi [~xuefuz], I will run a 1 TB TPCx-BB test comparing HIVE-15580, HIVE-15682, and an
unpatched package, and I will attach the results when I have them.

> Eliminate per-row based dummy iterator creation
> -----------------------------------------------
>
>                 Key: HIVE-15682
>                 URL: https://issues.apache.org/jira/browse/HIVE-15682
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>    Affects Versions: 2.2.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row, which can be eliminated because
> {{SparkReduceRecordHandler}} is able to handle single key-value pairs. We can refactor
> this part of the code to (1) remove the need for an iterator and (2) optimize the code
> path for per-(key, value) processing instead of (key, value iterator) processing. It would
> also be great to measure performance after the optimizations and compare it with
> performance prior to HIVE-15580.
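
The refactoring described in the issue can be illustrated with a minimal sketch. This is not the code from HIVE-15682.patch: the class DummyIteratorSketch, the SumHandler stand-in, and both run methods are hypothetical names invented for illustration, and the handler only sums integers. The sketch assumes a handler that exposes both a (key, value iterator) entry point and a (key, value) entry point, and contrasts wrapping every row in a single-element iterator with passing the pair directly.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DummyIteratorSketch {

    /** Hypothetical stand-in for a reduce-side record handler; not Hive's class. */
    static final class SumHandler {
        long total = 0;
        int iteratorsCreated = 0; // bookkeeping for the illustration only

        // (key, value iterator) path: consumes every value supplied for one key.
        void processRow(String key, Iterator<Integer> values) {
            while (values.hasNext()) {
                total += values.next();
            }
        }

        // (key, value) path: handles one pair without any iterator.
        void processRow(String key, Integer value) {
            total += value;
        }
    }

    // Before: each input row is wrapped in a throwaway single-element iterator.
    static SumHandler runWithDummyIterators(List<Map.Entry<String, Integer>> rows) {
        SumHandler handler = new SumHandler();
        for (Map.Entry<String, Integer> row : rows) {
            Iterator<Integer> dummy =
                Collections.singletonList(row.getValue()).iterator();
            handler.iteratorsCreated++;
            handler.processRow(row.getKey(), dummy);
        }
        return handler;
    }

    // After: each pair goes straight to the single-pair entry point.
    static SumHandler runDirect(List<Map.Entry<String, Integer>> rows) {
        SumHandler handler = new SumHandler();
        for (Map.Entry<String, Integer> row : rows) {
            handler.processRow(row.getKey(), row.getValue());
        }
        return handler;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        rows.add(new SimpleEntry<>("a", 1));
        rows.add(new SimpleEntry<>("b", 2));
        rows.add(new SimpleEntry<>("a", 3));

        SumHandler before = runWithDummyIterators(rows);
        SumHandler after = runDirect(rows);
        System.out.println("with dummy iterators: total=" + before.total
            + ", iterators=" + before.iteratorsCreated);
        System.out.println("direct: total=" + after.total
            + ", iterators=" + after.iteratorsCreated);
    }
}
```

Both paths compute the same total; the direct path simply avoids one short-lived allocation per row. Whether that saving is measurable in practice is what the benchmark comparison requested in the issue is meant to establish.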



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
