hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-21771) Support partition filter (where clause) in REPL dump command (Bootstrap Dump)
Date Thu, 18 Jul 2019 09:58:00 GMT


Hive QA commented on HIVE-21771:

Here are the results of testing the latest attachment:

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 16680 tests executed
*Failed tests:*

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed

This message is automatically generated.

ATTACHMENT ID: 12975123 - PreCommit-HIVE-Build

> Support partition filter (where clause) in REPL dump command (Bootstrap Dump)
> -----------------------------------------------------------------------------
>                 Key: HIVE-21771
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>         Attachments: HIVE-21771.01.patch, HIVE-21771.02.patch
>          Time Spent: 10m
>  Remaining Estimate: 0h
> *Bootstrap for managed table*
> Users should be allowed to execute REPL DUMP with a where clause. The where clause should
support filtering out partitions from the dump. The format of the where clause should be similar to
*"REPL DUMP dbname from 10 where 't0' where key < 10,'t1'* where key = 3, '(t2*)|'t3' where
key > 3".* The initial version will support only very basic filter conditions; more complex
conditions will be added as and when required.
>  * From the AST generated for the where clause, extract the table information.
>  * Generate AST for each table.
>  * List the partitions of each table from its generated AST, using the same metastore API
  that select queries use.
>  * During bootstrap load, use the partition list to dump the partitions.
>  * During incremental dump, use the list to filter out the event.
> In case of bootstrap load, all the tables of the database will be scanned and:
>  * If the table is not partitioned, then it will be dumped.
>  * If the key provided in the filter condition for the table is not a partition column, then
the dump will fail.
>  * If the table is not mentioned in the where clause, then all partitions of the table will
be dumped.
>  * All the partitions of the table satisfying the where clause will be dumped.
> *Incremental for managed table (Not part of this patch)*
> In case of Incremental Dump, the events from the notification log will be scanned. Once
the partition spec is extracted from an event, it will be checked against the filter
condition.
>  * If the table is not partitioned, then the event will be added to the dump.
>  * If the key mentioned is not a partition column, then the dump will fail.
>  * If the table is not mentioned in the filter, then the event will be added to the dump.
>  * If the event covers multiple partitions, then the event will be added to the dump. (Filtering
out redundant partitions from the message will be done as part of a separate task.)
>  * If the partition spec matches the filter, then the event will be added to the dump.
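
The bootstrap and incremental rules quoted above can be read as two small decision
functions. What follows is only a rough sketch of that reading, not code from the patch:
TableFilter, ReplDumpError, bootstrap_partitions and include_event are hypothetical names,
partition specs are modelled as plain dicts of column name to string value, and keeping
events that carry no partition spec is an assumption the description does not cover.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

PartitionSpec = Dict[str, str]  # partition column name -> value (hypothetical model)


class ReplDumpError(Exception):
    """Hypothetical error for a filter key that is not a partition column."""


@dataclass
class TableFilter:
    """Hypothetical per-table filter extracted from the where clause."""
    key: str
    predicate: Callable[[PartitionSpec], bool]


def bootstrap_partitions(partition_cols: List[str],
                         partitions: List[PartitionSpec],
                         flt: Optional[TableFilter]) -> Optional[List[PartitionSpec]]:
    """Return the partitions to dump, or None when the whole table is dumped."""
    if not partition_cols:
        return None                      # not partitioned: dump the table
    if flt is None:
        return list(partitions)          # table not in where clause: all partitions
    if flt.key not in partition_cols:
        raise ReplDumpError(f"{flt.key} is not a partition column")
    return [p for p in partitions if flt.predicate(p)]


def include_event(partition_cols: List[str],
                  event_partitions: List[PartitionSpec],
                  flt: Optional[TableFilter]) -> bool:
    """Decide whether a notification event is added to an incremental dump."""
    if not partition_cols or flt is None:
        return True                      # unpartitioned table, or table not filtered
    if flt.key not in partition_cols:
        raise ReplDumpError(f"{flt.key} is not a partition column")
    if len(event_partitions) > 1:
        return True                      # multi-partition events are kept whole
    # Assumption: an event with no partition spec is kept (all() of an empty list).
    return all(flt.predicate(p) for p in event_partitions)
```

For instance, with a filter equivalent to "key < 10" written as
TableFilter("key", lambda p: int(p["key"]) < 10), a table partitioned on key with
partitions key=3 and key=12 would have only key=3 selected for the bootstrap dump, and a
single-partition event on key=12 would be left out of the incremental dump.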

This message was sent by Atlassian JIRA
