phoenix-dev mailing list archives

From "Rajeshbabu Chintaguntla (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-1734) Local index improvements
Date Tue, 03 Nov 2015 02:47:27 GMT


Rajeshbabu Chintaguntla commented on PHOENIX-1734:

bq. Each data table column family will have a corresponding local index column family formed
by prefixing the data table column family with a known prefix. The local index column families
will essentially be hidden from Phoenix.
bq. When a row is written to the data table, you write a corresponding row into the hidden
local index column family, prefixed with the region start key (i.e. the rows have different
row keys).
bq. Use a custom split policy to ensure that the local index column family does not get split.
Instead, you drive the split from the split of any data column family.
No. HBase has an optimization: if an hfile's max key is less than the split row, or its
min key is greater than the split row, no reference files need to be created, so no compaction
has to run for that hfile. The file is simply moved to the top or bottom daughter region. With
local indexing, however, almost all index rowkeys are less than the split row regardless of the
actual rowkeys, because they are prefixed with the region start key. The custom split policy
therefore makes sure reference files are created for both the top and bottom daughter regions,
so that IndexHalfStoreFileReader can split the data based on the actual rowkey.
This is the same as the current implementation of splitting index hfiles.
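To illustrate why the default check is misleading for local index hfiles, here is a minimal sketch, not the HBase or Phoenix code. The class `SplitFileDecisionSketch`, its enum, and its helper methods are hypothetical names made up for this example, and the index rowkey is simplified to "region start key + actual rowkey". "Top" and "bottom" follow the wording used in this comment.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names exist in HBase or Phoenix.
final class SplitFileDecisionSketch {
    /** Where an hfile ends up after a region split (naming follows the comment above). */
    enum Placement { TOP_ONLY, BOTTOM_ONLY, REFERENCE_BOTH }

    /** Unsigned lexicographic byte comparison, the ordering used for rowkeys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** The optimization as described above: skip reference files when the whole file is on one side. */
    static Placement defaultPlacement(byte[] minKey, byte[] maxKey, byte[] splitRow) {
        if (compare(maxKey, splitRow) < 0) {
            return Placement.TOP_ONLY;      // every key sorts before the split row
        }
        if (compare(minKey, splitRow) > 0) {
            return Placement.BOTTOM_ONLY;   // every key sorts after the split row
        }
        return Placement.REFERENCE_BOTH;    // file straddles the split row
    }

    /** What the custom split policy is described as doing for local index hfiles. */
    static Placement localIndexPlacement() {
        return Placement.REFERENCE_BOTH;
    }

    /** Simplified index rowkey: region start key followed by the actual rowkey. */
    static byte[] indexRowKey(byte[] regionStartKey, byte[] actualRowKey) {
        byte[] out = new byte[regionStartKey.length + actualRowKey.length];
        System.arraycopy(regionStartKey, 0, out, 0, regionStartKey.length);
        System.arraycopy(actualRowKey, 0, out, regionStartKey.length, actualRowKey.length);
        return out;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With a region start key of "d" and a split row of "m", every index rowkey starts with "d" and so sorts before "m", even when the actual rowkey (say "x") belongs to the other daughter. The default check would hand the whole file to one daughter, which is why both reference files are forced.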
The hfile splitting happens as follows.
-- IndexHalfStoreFileReader goes through all the keyvalues in the hfile.
-- For each keyvalue we get the actual rowkey; if it is less than the split row, the keyvalue
is given to the top daughter region, otherwise it belongs to the bottom daughter region.
-- After the split, when compaction runs, the top daughter region gets only the keyvalues
whose actual rowkey is less than the split row, and the bottom daughter region gets only the
keyvalues whose actual rowkey is greater than or equal to the split row.
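The steps above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the IndexHalfStoreFileReader code: the class `LocalIndexHalfSplitSketch` and its methods are hypothetical, keyvalues are reduced to their rowkeys, and the actual rowkey is assumed to be whatever follows a region start key prefix of known length.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these names are not part of Phoenix or HBase.
final class LocalIndexHalfSplitSketch {
    /** Unsigned lexicographic byte comparison, the ordering used for rowkeys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Assumption: the actual rowkey is the index rowkey minus the region start key prefix. */
    static byte[] actualRowKey(byte[] indexRowKey, int prefixLength) {
        return Arrays.copyOfRange(indexRowKey, prefixLength, indexRowKey.length);
    }

    /** True if the keyvalue goes to the top daughter (actual rowkey sorts before the split row). */
    static boolean belongsToTop(byte[] indexRowKey, int prefixLength, byte[] splitRow) {
        return compare(actualRowKey(indexRowKey, prefixLength), splitRow) < 0;
    }

    /** Returns two lists: index 0 holds the top daughter's keys, index 1 the bottom daughter's. */
    static List<List<byte[]>> split(List<byte[]> indexRowKeys, int prefixLength, byte[] splitRow) {
        List<byte[]> top = new ArrayList<>();
        List<byte[]> bottom = new ArrayList<>();
        for (byte[] key : indexRowKeys) {
            if (belongsToTop(key, prefixLength, splitRow)) {
                top.add(key);
            } else {
                bottom.add(key);
            }
        }
        return Arrays.asList(top, bottom);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

In this sketch a keyvalue whose actual rowkey equals the split row lands in the bottom daughter, matching the "otherwise" branch in the second step.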
bq. The "magic" is in the write of the local index rows is here:
That code is dead code. It is too internal, so I have commented out the call to it. HBase doesn't
allow writing mutations to the same region in the (pre/post)batchMutate coprocessor hooks, so I
postponed writing the index updates to post(Put/Delete).

> Local index improvements
> ------------------------
>                 Key: PHOENIX-1734
>                 URL:
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>         Attachments: PHOENI-1734-WIP.patch
> Local index design considerations: 
>  1. Colocation: We need to co-locate the local index regions with the data regions.
The co-location can be a hard guarantee or a soft (best-effort) guarantee. The co-location
is a performance requirement, and may also be needed for consistency (2). Hard co-location means
that either both the data region and the index region are opened atomically, or neither of them
is opened for serving. 
>  2. Index consistency: Ideally we want the index region and data region to have atomic
updates. This means that they should either (a) use transactions, or (b) share the
same WALEdit and also MVCC for visibility. (b) is only applicable if there is hard co-location.
>  3. Local index clients : How the local index will be accessed from clients. In case
of the local index being managed in a table, the HBase client can be used for doing scans,
etc. If the local index is hidden inside the data regions, there has to be a different mechanism
to access the data through the data region. 
> With the above considerations, we imagine three possible implementations for the local
index solution, each detailed below. 
> APPROACH 1:  Current approach
> (1) Current approach uses balancer as a soft guarantee. Because of this, in some rare
cases, colocation might not happen. 
> (2) The index and data regions do not share the same WALEdits. Meaning consistency cannot
be achieved. Also there are two WAL writes per write from client. 
> (3) Regular Hbase client can be used to access index data since index is just another
> APPROACH 2: Shadow regions + shared WAL & MVCC 
> (1) Introduce a shadow region concept in HBase. Shadow regions are not assigned by the AM.
Phoenix implements atomic open (and split/merge) for data regions and index
regions so that hard co-location is guaranteed. 
> (2) For consistency requirements, the index regions and data regions will share the same
WALEdit (and thus recovery) and they will also share the same MVCC mechanics so that the index
update and the data update become visible atomically. 
> (3) Regular Hbase client can be used to access index data since index is just another
> APPROACH 3: Storing index data in separate column families in the table.
>  (1) Regions will have store files for cfs, which are sorted using the primary sort order.
Regions may also maintain stores sorted in secondary sort orders. This approach is similar
in vein to how an RDBMS keeps data (a B-TREE in primary sort order and multiple B-TREEs in secondary
sort orders with pointers to the primary key). That means storing the index data in separate column
families in the data region. This way a region is extended to be more similar to an RDBMS (but
LSM instead of BTree). These are sometimes called shadow cf’s as well. This approach guarantees
hard co-location.
>  (2) Since everything is in a single region, they automatically share the same WALEdit
and MVCC numbers. Atomicity is easily achieved. 
>  (3) The current Phoenix implementation needs to change so that column family
selection in the read/write path is based on the data table/index table (logical table in Phoenix). 
> I think that APPROACH 3 is the best one for the long term, since it does not require changing
anything in HBase; mainly, we don't need to muck around with the split/merge stuff in HBase.
It will be win-win.
> However, APPROACH 2 still needs a “shadow regions” concept to be implemented in HBase
itself, and also a way to share WALEdits and MVCCs from multiple regions.
> APPROACH 1 is a good start for local indexes, but I think we are not getting the full
benefits for the feature. We can support this for the short term, and decide on the next steps
for a longer term implementation. 
> we won't be able to get to implementing it immediately, and want to start a brainstorm.

This message was sent by Atlassian JIRA