phoenix-dev mailing list archives

From "Kadir OZDEMIR (Jira)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-5535) Replay delete markers during server side global index rebuild
Date Sun, 27 Oct 2019 06:12:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir OZDEMIR updated PHOENIX-5535:
-----------------------------------
    Summary: Replay delete markers during server side global index rebuild   (was: Index rebuilds
via UngroupedAggregateRegionObserver should replay delete markers)

> Replay delete markers during server side global index rebuild 
> --------------------------------------------------------------
>
>                 Key: PHOENIX-5535
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5535
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Blocker
>             Fix For: 4.15.0, 5.1.0
>
>         Attachments: PHOENIX-5535.4.x-HBase-1.5.001.patch, PHOENIX-5535.master.001.patch,
PHOENIX-5535.master.002.patch, PHOENIX-5535.master.003.patch
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, index rebuilds for global index tables are done on the server side. The Phoenix
client generates an aggregate plan using ServerBuildIndexCompiler to scan every data table
row on the server side. This compiler sets the scan attributes so that the row mutations
scanned by UngroupedAggregateRegionObserver are replayed on the data table, which rebuilds
the index table rows. During this replay, data table row updates are skipped and only index
table rows are updated.
> Phoenix allows column entries to have null values. A null value is represented by an HBase
column delete marker. This means that an index rebuild must replay these delete markers along
with put mutations. To do that, ServerBuildIndexCompiler should use raw scans, but
currently it uses regular scans. This leads to incorrect index rebuilds when null values
are used.
> A simple test using a data table with one global index, whose covered column can
take a null value, is sufficient to reproduce this problem.
>  # Create a data table with columns a, b, and c, where a is the primary key and c
can have a null value
>  # Write one row with non-null values
>  # Overwrite the covered column with null (i.e., set it to null)
>  # Create an index on the table where b is the secondary key and c is the covered column
>  # Rebuild the index
>  # Dump the index table
> The index table row should have the null value for the covered column. However, it has
the non-null value written at step 2.
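
The difference between regular and raw scans that the description relies on can be sketched with a toy in-memory model. This is not Phoenix or HBase code: `Cell`, `regular_scan`, and `raw_scan` are hypothetical names invented for illustration, and the model assumes a column delete marker masks every version of that column at or below its timestamp. It only shows which cells each kind of scan exposes to a replay; it does not attempt to reproduce the exact rebuild failure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One stored version of a column (hypothetical model, not an HBase class)."""
    column: str
    timestamp: int
    value: Optional[str] = None  # None for a delete marker
    is_delete: bool = False      # assumed to mask versions with timestamp <= its own


def raw_scan(cells: List[Cell]) -> List[Cell]:
    """Return every stored cell, delete markers included, newest first per column."""
    return sorted(cells, key=lambda c: (c.column, -c.timestamp, not c.is_delete))


def regular_scan(cells: List[Cell]) -> List[Cell]:
    """Return the newest live put per column; markers and masked puts are hidden."""
    result = []
    for column in sorted({c.column for c in cells}):
        versions = [c for c in cells if c.column == column]
        # Newest delete marker for this column, or -1 if there is none.
        cutoff = max((c.timestamp for c in versions if c.is_delete), default=-1)
        live = [c for c in versions if not c.is_delete and c.timestamp > cutoff]
        if live:
            result.append(max(live, key=lambda c: c.timestamp))
    return result


# One data table row: b and c written at timestamp 1, then c set to null at timestamp 2.
row = [
    Cell("b", 1, "x"),
    Cell("c", 1, "y"),
    Cell("c", 2, is_delete=True),
]

print("regular:", [(c.column, c.timestamp, c.is_delete) for c in regular_scan(row)])
print("raw:    ", [(c.column, c.timestamp, c.is_delete) for c in raw_scan(row)])
```

In this model the regular scan never surfaces the delete marker on c, so a replay driven by it has no delete mutation to apply, whereas the raw scan hands the marker to the replay along with the puts.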



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
