phoenix-dev mailing list archives

From "Brian Johnson (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-1247) Join using sorted data
Date Wed, 10 Sep 2014 20:57:33 GMT
Brian Johnson created PHOENIX-1247:
--------------------------------------

             Summary: Join using sorted data
                 Key: PHOENIX-1247
                 URL: https://issues.apache.org/jira/browse/PHOENIX-1247
             Project: Phoenix
          Issue Type: New Feature
            Reporter: Brian Johnson


Similar to Pig's merge join, Phoenix should have a join that takes advantage of the sorted nature
of HBase rowkeys. If two tables each have a column whose sort order matches the rowkey order,
they can be joined efficiently without holding either table in RAM. This also depends on using
a split policy that keeps rows sharing a key prefix in the same region, such as DelimitedKeyPrefixRegionSplitPolicy.
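
To illustrate the idea, here is a minimal sketch of a sort-merge inner join over two streams that are already sorted by the join key. It is not Phoenix or Pig code; the function name merge_join and the (key, value) tuple shape are assumptions made for the example. Both inputs are consumed in one forward pass, and only the right-side rows for the current key are buffered, so memory use is bounded by the largest single-key group rather than by table size.

```python
from typing import Any, Iterable, Iterator, List, Tuple

Row = Tuple[Any, Any]  # (join_key, value); hypothetical row shape for this sketch


def merge_join(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Tuple[Any, Any, Any]]:
    """Inner-join two iterables of (key, value) rows, each already sorted by key."""
    left_it, right_it = iter(left), iter(right)
    l = next(left_it, None)
    r = next(right_it, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            # Left key is behind: advance the left stream.
            l = next(left_it, None)
        elif l[0] > r[0]:
            # Right key is behind: advance the right stream.
            r = next(right_it, None)
        else:
            key = l[0]
            # Buffer only the right-side values for this one key.
            group: List[Any] = []
            while r is not None and r[0] == key:
                group.append(r[1])
                r = next(right_it, None)
            # Pair every left row of this key with the buffered group.
            while l is not None and l[0] == key:
                for right_value in group:
                    yield key, l[1], right_value
                l = next(left_it, None)


# Example data, both lists sorted by user id.
events = [("u1", "login"), ("u1", "click"), ("u2", "login"), ("u4", "view")]
profiles = [("u1", "alice"), ("u3", "carol"), ("u4", "dave")]
joined = list(merge_join(events, profiles))
```

With the sample data, joined contains the two u1 events paired with alice and the u4 event paired with dave; u2 and u3 have no match and are skipped.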

As an example, we keep user data in HBase where the first part of the rowkey is the user id and
the second part makes the key unique for each event. We also have a column that holds just the user
id, which is always sorted because it follows the rowkey order.
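
The key layout described above can be sketched as follows. The exact encoding is an assumption made for illustration (a fixed-width, zero-padded user id, a '|' delimiter, and an event sequence number); the message does not specify the real format. Fixed width matters here: with variable-length ids, byte-wise rowkey order would not necessarily match the order of the ids themselves.

```python
def make_rowkey(user_id: int, event_seq: int) -> bytes:
    # Hypothetical layout: 8-digit zero-padded user id, '|' delimiter,
    # then a 6-digit event sequence that makes the key unique per event.
    return f"{user_id:08d}|{event_seq:06d}".encode("ascii")


def user_id_of(rowkey: bytes) -> bytes:
    # The user id is everything before the first delimiter.
    return rowkey.split(b"|", 1)[0]


# Python compares bytes lexicographically, the same way HBase orders rowkeys.
rowkeys = sorted(
    make_rowkey(u, e) for u, e in [(42, 3), (7, 1), (42, 1), (7, 2), (1000, 1)]
)

# The derived user-id column is non-decreasing when read in rowkey order.
user_ids = [user_id_of(k) for k in rowkeys]
assert user_ids == sorted(user_ids)
```

Because the user-id column comes out already ordered, a scan over each table can feed a merge-style join directly, with no extra sort step.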



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
