hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild
Date Tue, 02 Apr 2019 03:57:00 GMT


Hive QA commented on HIVE-20382:

Here are the results of testing the latest attachment:

ERROR: -1 due to no test(s) being added or modified.

ERROR: -1 due to 2 failed/errored test(s), 15890 tests executed
*Failed tests:*
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv] (batchId=195)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMetastoreTablesCleanup (batchId=327)

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed

This message is automatically generated.

ATTACHMENT ID: 12964486 - PreCommit-HIVE-Build

> Materialized views: Introduce heuristic to favour incremental rebuild
> ---------------------------------------------------------------------
>                 Key: HIVE-20382
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Materialized views
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-20382.01.patch, HIVE-20382.patch, HIVE-20382.patch
> Currently, we do not expose stats over ROW__ID.writeId to the optimizer (this should
be fixed by HIVE-20313). Even if we did, we always assume a uniform distribution of the column
values, which can easily lead to overestimating the number of rows read when we filter
on ROW__ID.writeId for materialized views (think of one large transaction for MV creation
followed by small ones for incremental maintenance). This overestimation can prevent incremental
view maintenance from being triggered, because the cost of the incremental plan is overestimated (we
think we will read more rows than we actually do). This could be fixed by introducing histograms
that better reflect the distribution of the column values.
> Until both fixes are implemented, we will use a config variable that multiplies the
estimated cost of the rebuild plan, so that incremental rebuild can be favoured over
full rebuild.
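
The effect described above can be illustrated with a small sketch. It is not Hive's optimizer code: the function names, the row counts, the 1.2 per-row overhead for the incremental plan, the "cost equals rows read" model, and the 0.1 factor are all hypothetical values chosen for illustration, and it assumes the factor is applied to the incremental plan's estimated cost.

```python
def uniform_estimate(total_rows, min_write_id, max_write_id, threshold):
    """Rows expected to pass `writeId > threshold` if rows were spread
    evenly across all write IDs (the uniform-distribution assumption)."""
    distinct_ids = max_write_id - min_write_id + 1
    matching_ids = max(0, max_write_id - max(threshold, min_write_id - 1))
    return total_rows * matching_ids / distinct_ids


def actual_rows(rows_per_write_id, threshold):
    """Rows that really pass `writeId > threshold`."""
    return sum(n for wid, n in rows_per_write_id.items() if wid > threshold)


def prefers_incremental(incremental_cost, full_cost, factor):
    """Hypothetical heuristic: scale the incremental plan's estimated cost
    by `factor` before comparing it with the full rebuild's cost."""
    return incremental_cost * factor < full_cost


# Hypothetical table: one large transaction (write ID 1) followed by
# ten small ones (write IDs 2..11).
rows_per_write_id = {1: 1_000_000}
rows_per_write_id.update({wid: 100 for wid in range(2, 12)})
total_rows = sum(rows_per_write_id.values())  # 1,001,000

# Filter on writeId > 1, i.e. only the rows written by the small transactions.
estimated_rows = uniform_estimate(total_rows, 1, 11, threshold=1)  # 910000.0
real_rows = actual_rows(rows_per_write_id, threshold=1)            # 1000

# Toy cost model (assumption): cost is rows read, and the incremental plan
# pays a 1.2x per-row overhead for merging into the existing view.
full_cost = total_rows
incremental_cost = estimated_rows * 1.2

print(estimated_rows, real_rows)
print(prefers_incremental(incremental_cost, full_cost, factor=1.0))
print(prefers_incremental(incremental_cost, full_cost, factor=0.1))
```

In this toy setup the uniform assumption estimates 910,000 rows where only 1,000 are actually read, so without scaling the full rebuild looks cheaper. A factor below 1 compensates for the overestimate until better statistics are available.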

This message was sent by Atlassian JIRA
