hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
Date Fri, 27 Jul 2018 17:02:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559999#comment-16559999 ]

Hive QA commented on HIVE-19927:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933322/HIVE-19927.01-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 14410 tests executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=237)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=70)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking] (batchId=174)
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testManagedPaths (batchId=235)
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation (batchId=243)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=310)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933322 - PreCommit-HIVE-Build

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19927
>                 URL: https://issues.apache.org/jira/browse/HIVE-19927
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl, Transactions
>    Affects Versions: 3.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, pull-request-available, replication
>             Fix For: 4.0.0
>
>         Attachments: HIVE-19927.01-branch-3.patch, HIVE-19927.01.patch, HIVE-19927.02.patch, HIVE-19927.03.patch, HIVE-19927.04.patch
>
>
> During a bootstrap dump of ACID tables, consider the following sequence:
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables using the validTxnList based on Txn1. --> This step skips all data written by txns > Txn1, so T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2, which was opened after Txn1. So, data T1.D1 will be lost forever.
> Proposed fix: capture the lastReplId of the bootstrap dump before opening the current txn (Txn1), store it in the Driver context, and use it for the dump.
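
The race described in the quoted issue can be illustrated with a small toy simulation. This is only a sketch under simplifying assumptions, not Hive code: the function name replicate, the tuple-based event log, and the rule that an incremental dump applies a transaction's writes only when its open, write, and commit events all fall after lastReplId are hypothetical stand-ins chosen to mirror the sequence above. Event numbering starts at 10 to match the issue text.

```python
def replicate(capture_before_open: bool) -> set:
    """Toy model of bootstrap + incremental replication (hypothetical, not Hive's API).

    Returns the set of values that reach the target after a bootstrap
    dump/load followed by one incremental dump/load.
    """
    log = []  # event log entries: (event_id, kind, txn_id, value)

    def emit(kind, txn, value=None):
        # Event IDs start at 10 only to mirror the numbering in the issue.
        log.append((10 + len(log), kind, txn, value))

    def last_event_id():
        return log[-1][0] if log else 9

    early_id = last_event_id()   # proposed fix: read before opening Txn1 (Event-9)
    emit("open", 1)              # Event-10: REPL DUMP session opens Txn1
    emit("open", 2)              # Event-11: Session-2 opens Txn2
    emit("write", 2, "T1.D1")    # Event-12: Session-2 inserts T1.D1
    late_id = last_event_id()    # original behaviour: read here (Event-12)
    emit("commit", 2)            # Event-13: Session-2 commits Txn2

    # Bootstrap snapshot based on Txn1: writes by txns >= Txn1 are invisible.
    bootstrap = {v for (_, kind, txn, v) in log if kind == "write" and txn < 1}
    emit("commit", 1)            # Event-14: REPL DUMP session commits Txn1

    last_repl_id = early_id if capture_before_open else late_id

    # Incremental dump: replay only events after last_repl_id.
    tail = [e for e in log if e[0] > last_repl_id]
    opened = {txn for (_, kind, txn, _) in tail if kind == "open"}
    committed = {txn for (_, kind, txn, _) in tail if kind == "commit"}
    incremental = {
        v for (_, kind, txn, v) in tail
        if kind == "write" and txn in opened and txn in committed
    }
    return bootstrap | incremental


if __name__ == "__main__":
    print("lastReplId read after Txn1 opens: ", replicate(False))
    print("lastReplId read before Txn1 opens:", replicate(True))
```

In this model, reading lastReplId after Txn1 is opened leaves T1.D1 out of both the bootstrap snapshot and the incremental replay, while reading it before Txn1 is opened lets the incremental replay pick up Txn2 in full.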



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
