hadoop-mapreduce-dev mailing list archives

From "Eli Collins (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-3768) MR-2450 introduced a significant performance regression
Date Tue, 31 Jan 2012 03:52:10 GMT
MR-2450 introduced a significant performance regression

                 Key: MAPREDUCE-3768
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3768
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
    Affects Versions: 0.23.1
            Reporter: Eli Collins
            Priority: Blocker

MAPREDUCE-2450 introduced, or at least triggers, a significant performance regression in Hive.
With MAPREDUCE-2450 applied, the execution time of TestCliDriver.skewjoin goes from 2 minutes to 15 minutes.
Reverting the change from the build fixes the issue.

Here's the relevant query:

FROM src src1 JOIN src src2 ON (src1.key = src2.key)
INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value; 
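
For readers less familiar with HiveQL, here is a rough sketch of what this self-join computes, written in plain Python. It is not Hive's implementation and says nothing about where the regression comes from; the function name self_join and the sample rows are made up for illustration. It only shows that a key occurring n times in src contributes n * n output rows, which is why skewed keys make this query expensive.

```python
from collections import defaultdict


def self_join(rows):
    """Emulate: FROM src src1 JOIN src src2 ON (src1.key = src2.key)
    SELECT src1.key, src2.value.

    rows is a list of (key, value) pairs standing in for the src table.
    (Hypothetical helper, not part of Hive or Hadoop.)
    """
    # Index the src2 side by key so each src1 row can find its matches.
    values_by_key = defaultdict(list)
    for key, value in rows:
        values_by_key[key].append(value)

    joined = []
    for key, _ in rows:                    # src1 side
        for value in values_by_key[key]:   # every src2 row with the same key
            joined.append((key, value))
    return joined


# Made-up sample data: key "a" appears twice, key "b" once.
sample = [("a", "v1"), ("a", "v2"), ("b", "v3")]
result = self_join(sample)
print(len(result))  # 2*2 rows for "a" plus 1*1 row for "b" = 5
```
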

You can reproduce this by running the following from Hive 0.8.0 against Hadoop built from branch-23:

ant very-clean package test -Dtestcase=TestCliDriver -Dqfile=skewjoin.q
