GitHub user vkorukanti opened a pull request:
https://github.com/apache/drill/pull/215
DRILL-3739: Fix issues in reading Hive tables with StorageHandler configuration (eg. Hive-HBase
tables)
The issue: for Hive tables with custom storage handlers (such as HBase-backed Hive tables),
the InputFormat class is not stored in the StorageDescriptor in the Hive metastore. Instead, it
is retrieved from StorageHandler.getInputFormatClass. This change was made in Hive after Hive 0.13.
The fix: if we can't find the InputFormat class in the metastore, create the StorageHandler
instance for the table and get the InputFormat from that instance. If the StorageHandler doesn't
exist, throw an exception. This behavior is similar to Hive's.
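The lookup order described above can be sketched roughly as follows. This is only an
illustration, not the code in the patch: TableMeta, the StorageHandler interface, and the
handler registry map are hypothetical stand-ins for the Hive metastore and handler types (the
fix itself creates the handler instance for the table), and the class names are used only as
sample strings.

```java
import java.util.HashMap;
import java.util.Map;

public class InputFormatLookup {

    /** Hypothetical stand-in for a Hive storage handler: only the one call used here. */
    public interface StorageHandler {
        String getInputFormatClassName();
    }

    /** Hypothetical, simplified view of what the metastore holds for one table. */
    public static final class TableMeta {
        final String name;
        final String inputFormat;     // null when the StorageDescriptor has none
        final String storageHandler;  // null for tables without a custom handler

        public TableMeta(String name, String inputFormat, String storageHandler) {
            this.name = name;
            this.inputFormat = inputFormat;
            this.storageHandler = storageHandler;
        }
    }

    /**
     * Returns the InputFormat class name for a table. The handler map is an
     * assumption of this sketch; it replaces instantiating the handler class.
     */
    public static String resolveInputFormat(TableMeta table,
                                            Map<String, StorageHandler> handlers) {
        // 1. Prefer the class recorded in the metastore, when present.
        if (table.inputFormat != null && !table.inputFormat.isEmpty()) {
            return table.inputFormat;
        }
        // 2. Otherwise fall back to the table's storage handler.
        StorageHandler handler =
            table.storageHandler == null ? null : handlers.get(table.storageHandler);
        // 3. No InputFormat and no handler: fail loudly.
        if (handler == null) {
            throw new IllegalStateException("Unable to get InputFormat for table "
                + table.name + ": none in metastore and no usable StorageHandler");
        }
        return handler.getInputFormatClassName();
    }

    public static void main(String[] args) {
        String hbaseHandler = "org.apache.hadoop.hive.hbase.HBaseStorageHandler";
        Map<String, StorageHandler> handlers = new HashMap<>();
        handlers.put(hbaseHandler,
            () -> "org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat");

        TableMeta hbaseBacked = new TableMeta("hbase_tbl", null, hbaseHandler);
        System.out.println(resolveInputFormat(hbaseBacked, handlers));
    }
}
```

A table that already has an InputFormat in the metastore never touches the handler path, so
only storage-handler tables see different behavior in this sketch.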
For Hive-HBase tables to work, the following config properties need to be added to the
Hive storage plugin config section:
"hbase.zookeeper.quorum": "zkhost1,zkhost2,zkhost3",
"hbase.zookeeper.property.clientPort": "2181" // ZooKeeper port
These properties are expected by HBaseStorageHandler to discover the HBase services.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vkorukanti/drill DRILL-3739
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/215.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #215
----
commit 60dfac886a01f53ed94ab146bedbb16fdc90427f
Author: vkorukanti <venki.korukanti@gmail.com>
Date: 2015-10-19T18:35:09Z
DRILL-3938: Support reading from Hive tables that have schema altered after the creation
Also:
+ Remove "redoRecord" logic which is not needed after "automatic reallocation" (DRILL-1960) changes.
+ Remove HiveTestRecordReader. This is incomplete in implementation and not used anywhere.
  It is currently just a burden to maintain with changes in its superclass HiveRecordReader.
commit 2b28eab82f6c34bc7a27c96ddd3caf7371529f7f
Author: vkorukanti <venki.korukanti@gmail.com>
Date: 2015-10-20T23:21:09Z
DRILL-3893: Change Hive metadata cache invalidation policy to "1 min after last write".
commit 5d35df1b085f7c4add207c3017a08a531da65dee
Author: vkorukanti <venki.korukanti@gmail.com>
Date: 2015-10-21T18:01:23Z
DRILL-3739: Fix issues in reading Hive tables with StorageHandler configuration (eg. Hive-HBase
tables)
----