hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
Date Sat, 13 Apr 2019 17:35:01 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=227203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227203 ]

ASF GitHub Bot logged work on HIVE-21529:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Apr/19 17:34
            Start Date: 13/Apr/19 17:34
    Worklog Time Spent: 10m 
      Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r275125530
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##########
 @@ -137,17 +137,69 @@ private void prepareReturnValues(List<String> values) throws SemanticException {
     Utils.writeOutput(values, new Path(work.resultTempPath), conf);
   }
 
+  /**
+   * Decide whether to examine all the tables to dump. We do this if
+   * 1. External tables are going to be part of the dump : In which case we need to list their
+   * locations.
+   * 2. External or ACID tables are being bootstrapped for the first time : so that we can dump
+   * those tables as a whole.
+   * @return
+   */
+  private boolean shouldDumpExternalTableLocation() {
+    return conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES)
+            && (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY)
+            || conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES));
+  }
+
+  private boolean shouldExamineTablesToDump() {
+    return shouldDumpExternalTableLocation() ||
+            conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES);
+  }
+
+  private boolean shouldBootstrapDumpTable(Table table) {
+    if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES) &&
+            TableType.EXTERNAL_TABLE.equals(table.getTableType())) {
+      return true;
+    }
+
+    if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES) &&
+           AcidUtils.isTransactionalTable(table)) {
+      return true;
+    }
+
+    return false;
+  }
+
   private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive hiveDb) throws Exception {
     Long lastReplId;// get list of events matching dbPattern & tblPattern
     // go through each event, and dump out each event to a event-level dump dir inside dumproot
+    ValidTxnList validTxnList = null;
+    long waitUntilTime = 0;
+    long bootDumpBeginReplId = -1;
+
+    // If we are bootstrapping ACID tables, we need to perform steps similar to a regular
+    // bootstrap (See bootstrapDump() for more details. Only difference here is instead of
+    // waiting for the concurrent transactions to finish, we start dumping the incremental events
+    // and wait only for the remaining time if any.
+    if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) {
+      bootDumpBeginReplId = queryState.getConf().getLong(ReplUtils.LAST_REPL_ID_KEY, -1L);
+      assert (bootDumpBeginReplId >= 0);
+      LOG.info("Dump for bootstrapping ACID tables during an incremental dump for db {} and table {}",
+              work.dbNameOrPattern,
+              work.tableNameOrPattern);
+      validTxnList = getTxnMgr().getValidTxns();
+      long timeoutInMs = HiveConf.getTimeVar(conf,
+              HiveConf.ConfVars.REPL_BOOTSTRAP_DUMP_OPEN_TXN_TIMEOUT, TimeUnit.MILLISECONDS);
+      waitUntilTime = System.currentTimeMillis() + timeoutInMs;
+    }
 
     // TODO : instead of simply restricting by message format, we should eventually
     // move to a jdbc-driver-stype registering of message format, and picking message
     // factory per event to decode. For now, however, since all messages have the
     // same factory, restricting by message format is effectively a guard against
     // older leftover data that would cause us problems.
 
-    work.overrideEventTo(hiveDb);
+    work.overrideEventTo(hiveDb, bootDumpBeginReplId);
 
 Review comment:
   work.maxEventLimit() should limit the events less than bootDumpBeginReplId. Need to set it properly.
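
One possible reading of the comment above, sketched as standalone Java rather than as a change to ReplDumpWork: when ACID bootstrap is requested, the upper event bound is clamped to bootDumpBeginReplId before the event limit is derived from it. The class EventLimitSketch, both method names, the inclusive range and the nullable configuredLimit parameter are all assumptions made for illustration; none of them are taken from the pull request.

```java
// Hypothetical sketch, not the Hive implementation.
final class EventLimitSketch {
    private EventLimitSketch() {}

    // Clamp the upper event id to the id recorded when the bootstrap began.
    // A negative bootDumpBeginReplId is assumed to mean "no ACID bootstrap".
    static long effectiveEventTo(long eventTo, long bootDumpBeginReplId) {
        return bootDumpBeginReplId >= 0 ? Math.min(eventTo, bootDumpBeginReplId) : eventTo;
    }

    // Derive the maximum number of events to fetch from the clamped range.
    // Treating the range as inclusive on both ends is an assumption.
    static int maxEventLimit(long eventFrom, long eventTo, long bootDumpBeginReplId,
                             Integer configuredLimit) {
        long upper = effectiveEventTo(eventTo, bootDumpBeginReplId);
        if (upper < eventFrom) {
            throw new IllegalArgumentException("Upper event id is below the starting event id");
        }
        long capped = Math.min(upper - eventFrom + 1, (long) Integer.MAX_VALUE);
        if (configuredLimit != null) {
            capped = Math.min(capped, (long) configuredLimit.intValue());
        }
        return (int) capped;
    }
}
```

With eventFrom = 100, eventTo = 500 and bootDumpBeginReplId = 300, this sketch limits the dump to the 201 events in [100, 300]; whether the bound should be inclusive is for the patch author to decide.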
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 227203)
    Time Spent: 3.5h  (was: 3h 20m)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---------------------------------------------------------------
>
>                 Key: HIVE-21529
>                 URL: https://issues.apache.org/jira/browse/HIVE-21529
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl, Transactions
>    Affects Versions: 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: DR, pull-request-available, replication
>         Attachments: HIVE-21529.01.patch, HIVE-21529.02.patch, HIVE-21529.03.patch
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an existing repl policy, then the bootstrap dump of these tables needs to be combined with the ongoing incremental dump.
> A one-time config "hive.repl.bootstrap.acid.tables" shall be added to include the bootstrap in the given dump.
> The support for hive.repl.bootstrap.cleanup.type for ACID tables, which cleans up partially bootstrapped tables in case of retry, is already in place thanks to the work done for external tables. Need to test that it actually works.
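
The per-table decision that the description calls for can be sketched as below, mirroring the shape of shouldBootstrapDumpTable() in the diff above. This is a simplified standalone illustration: the class BootstrapDecisionSketch and the Kind enum are hypothetical stand-ins for the Table and AcidUtils checks, and the two boolean parameters stand in for the REPL_BOOTSTRAP_EXTERNAL_TABLES and REPL_BOOTSTRAP_ACID_TABLES config lookups.

```java
// Hypothetical sketch, not the Hive implementation.
final class BootstrapDecisionSketch {
    // Simplified table classification; the diff derives this from the
    // table type and from AcidUtils.isTransactionalTable(table).
    enum Kind { MANAGED, EXTERNAL, ACID }

    private BootstrapDecisionSketch() {}

    // A table is dumped as a whole during the incremental dump only when the
    // one-time bootstrap flag matching its kind is switched on.
    static boolean shouldBootstrapDumpTable(Kind kind, boolean bootstrapExternal,
                                            boolean bootstrapAcid) {
        if (bootstrapExternal && kind == Kind.EXTERNAL) {
            return true;
        }
        return bootstrapAcid && kind == Kind.ACID;
    }
}
```

Under these assumptions an ordinary managed table is never bootstrapped by the incremental dump, whichever flags are set.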



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
