phoenix-dev mailing list archives

From "Andrew Purtell (JIRA)"
Subject [jira] [Commented] (PHOENIX-946) Use Phoenix to service Hive queries over HBase data
Date Wed, 23 Apr 2014 16:37:17 GMT


Andrew Purtell commented on PHOENIX-946:

bq. I don't see all that much use in being able to parse HiveQL or tying in to the HiveMetastore/HCat
from Phoenix, as that would just result in a Hive-similar query tool that can only query Phoenix-stored
HBase data (as opposed to what Hive can do, which is query pretty much anything)

Both Hive and Phoenix are JDBC drivers at the client. One could imagine, as a starting point,
an uber driver that accepts some dialect of SQL, rewrites it as needed, dispatches to either
Phoenix or Hive, and then does a local merge of the results. Because this would be a facade over
the gritty details, further evolution would not be burdensome on the user. Then you might look at
treating the Phoenix back end hosted in HBase as an acceleration service and push integration
further in, from the Hive client to the storage engine layer. Call that HivePhoenixHandler. That
would be an interesting challenge. You'd want as much query work as possible done on the server
side, and heuristics for joining would be fun. Phoenix will have some native support for joining
between HBase data sources. Hive might rewrite joins between HBase and other data sources to
offload the HBase work to the Phoenix server pieces, then use its Tez- or MR-backed workflow to
join the results of the HBase data source joins with the other non-HBase data sources. Sounds fun.
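
To make the "uber driver" starting point concrete, here is a rough sketch in Python rather than
against the real JDBC APIs. Everything in it is hypothetical: the FacadeDriver class, the routing
table, the table names, and the stub backends are illustrations only, and nothing like this exists
in Phoenix or Hive. It assumes the query has already been split into one sub-query per table, so
the SQL rewrite step is left out, and the "local merge" is reduced to a sort.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, int]
Backend = Callable[[str], List[Row]]  # takes SQL text, returns result rows


class FacadeDriver:
    """Hypothetical facade: route each sub-query to one backend, merge locally."""

    def __init__(self, backends: Dict[str, Backend], routing: Dict[str, str]) -> None:
        self.backends = backends  # backend name -> callable that runs SQL
        self.routing = routing    # table name -> backend name (assumed known up front)

    def execute(self, subqueries: Dict[str, str]) -> List[Row]:
        merged: List[Row] = []
        for table, sql in subqueries.items():
            # Dispatch: pick the backend that owns this table.
            backend = self.backends[self.routing[table]]
            merged.extend(backend(sql))
        # Local merge: a plain sort stands in for a real merge or join step.
        return sorted(merged)


# Stub backends standing in for real Phoenix and Hive connections.
def fake_phoenix(sql: str) -> List[Row]:
    return [("hbase_row_b", 2), ("hbase_row_a", 1)]


def fake_hive(sql: str) -> List[Row]:
    return [("hdfs_row_c", 3)]


driver = FacadeDriver(
    backends={"phoenix": fake_phoenix, "hive": fake_hive},
    routing={"events": "phoenix", "logs": "hive"},
)
rows = driver.execute({
    "events": "SELECT k, v FROM events",
    "logs": "SELECT k, v FROM logs",
})
print(rows)
```

The point of the facade is that the routing table and the merge step can change later (for
example, pushing more work to the server side) without the caller's code changing.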

> Use Phoenix to service Hive queries over HBase data
> ---------------------------------------------------
>                 Key: PHOENIX-946
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor

This message was sent by Atlassian JIRA
