hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Work logged] (HIVE-21960) HMS tasks on replica
Date Mon, 29 Jul 2019 16:35:00 GMT


ASF GitHub Bot logged work on HIVE-21960:

                Author: ASF GitHub Bot
            Created on: 29/Jul/19 16:34
            Start Date: 29/Jul/19 16:34
    Worklog Time Spent: 10m 
      Work Description: ashutosh-bapat commented on pull request #735: HIVE-21960 : Avoid
running stats updater and partition management task on a replicated table.

 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/
 @@ -220,6 +221,16 @@ private void stopWorkers() {
     String skipParam = table.getParameters().get(SKIP_STATS_AUTOUPDATE_PROPERTY);
     if ("true".equalsIgnoreCase(skipParam)) return null;
+    // If the table is being replicated into,
+    // 1. the stats are also replicated from the source, so we don't need those to be calculated
+    //    on the target again
+    // 2. updating stats requires a writeId to be created. Hence writeIds on source and target
+    //    can get out of sync when stats are updated. That can cause consistency issues.
+    String replTrgtParam = table.getParameters().get(ReplConst.REPL_TARGET_PROPERTY);
 Review comment:
   That window is too small, but nevertheless finite. Should we add with value
0 or 1- when creating the table similar to what we are doing for the db?
   The other option is to check a db-level property, e.g. the checkpoint, but that means fetching
the Database object every time we assess whether a table needs the stats updater or the partition
management task to run. For a small-window corner case, this looks pretty expensive.
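
For illustration, the table-level check discussed above can be sketched as follows. This is a minimal sketch, not the code from the pull request: the class name, the method name, and both property key values are hypothetical placeholders, and it assumes that any non-empty value of the replication target property marks a table as a replication target.

```java
import java.util.HashMap;
import java.util.Map;

public class ReplicaStatsCheck {
    // Hypothetical placeholder keys; the real keys are defined in Hive's own constants.
    static final String SKIP_STATS_AUTOUPDATE_PROPERTY = "example.skip.stats.autoupdate";
    static final String REPL_TARGET_PROPERTY = "example.repl.target";

    /**
     * Returns true when the stats updater should leave the table alone.
     * Only the table's own parameters are consulted, so no Database
     * object needs to be fetched.
     */
    static boolean shouldSkipStatsUpdate(Map<String, String> tableParams) {
        if (tableParams == null) {
            return false;
        }
        // An explicit opt-out set on the table.
        if ("true".equalsIgnoreCase(tableParams.get(SKIP_STATS_AUTOUPDATE_PROPERTY))) {
            return true;
        }
        // Assumption: a non-empty value means the table is being replicated into.
        // Its stats arrive from the source, and recomputing them on the target
        // would allocate a writeId that the source does not know about.
        String replTarget = tableParams.get(REPL_TARGET_PROPERTY);
        return replTarget != null && !replTarget.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(shouldSkipStatsUpdate(params)); // plain table
        params.put(REPL_TARGET_PROPERTY, "1");
        System.out.println(shouldSkipStatsUpdate(params)); // replication target
    }
}
```

A check like this stays cheap because it reads only the table's parameter map, but it only helps once the property is actually present on the table, which is the window the comment refers to.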

Issue Time Tracking

    Worklog Id:     (was: 284385)
    Time Spent: 1h 10m  (was: 1h)

> HMS tasks on replica
> --------------------
>                 Key: HIVE-21960
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, repl
>    Affects Versions: 4.0.0
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21960.01.patch, HIVE-21960.02.patch, HIVE-21960.03.patch, Replication and House keeping tasks.pdf
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
> An HMS performs a number of housekeeping tasks. Assess whether
>  # They need to be performed on the replicated data
>  # Performing them on replicated data causes any issues and, if so, how to fix those.

This message was sent by Atlassian JIRA
