drill-dev mailing list archives

From vkorukanti <...@git.apache.org>
Subject [GitHub] drill pull request: DRILL-3739: Fix issues in reading Hive tables ...
Date Thu, 22 Oct 2015 00:26:11 GMT
GitHub user vkorukanti opened a pull request:


    DRILL-3739: Fix issues in reading Hive tables with StorageHandler configuration (eg. Hive-HBase

    Issue: for Hive tables with custom storage handlers (such as HBase-backed Hive tables),
    the InputFormat class is not stored in the StorageDescriptor in the Hive metastore. Instead,
    it is retrieved from StorageHandler.getInputFormatClass. This is a change made in Hive
    after Hive 0.13.
    Fix: if the InputFormat class cannot be found in the metastore, create a StorageHandler
    instance for the table and get the InputFormat from that instance. If the StorageHandler
    doesn't exist, throw an exception. This behavior is similar to Hive's.
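
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in the pull request: `TableInfo`, `StorageHandlerLike`, `InputFormatResolver`, and the `handlerFactory` parameter are hypothetical stand-ins so the sketch compiles without Hive on the classpath, and class names are passed around as plain strings.

```java
import java.util.function.Function;

// Hypothetical stand-in for the table metadata read from the metastore.
// These are NOT Hive's real classes.
final class TableInfo {
    final String inputFormatClass;    // value from the StorageDescriptor; may be null
    final String storageHandlerClass; // storage handler configured on the table; may be null

    TableInfo(String inputFormatClass, String storageHandlerClass) {
        this.inputFormatClass = inputFormatClass;
        this.storageHandlerClass = storageHandlerClass;
    }
}

// Hypothetical stand-in for a storage handler that can name its InputFormat.
interface StorageHandlerLike {
    String getInputFormatClassName();
}

final class InputFormatResolver {
    // handlerFactory (an assumption of this sketch) instantiates a handler from its
    // class name and returns null when the handler cannot be found.
    static String resolve(TableInfo table,
                          Function<String, StorageHandlerLike> handlerFactory) {
        // 1. Prefer the InputFormat recorded in the metastore, if there is one.
        if (table.inputFormatClass != null && !table.inputFormatClass.isEmpty()) {
            return table.inputFormatClass;
        }
        // 2. Otherwise fall back to the table's storage handler.
        if (table.storageHandlerClass == null || table.storageHandlerClass.isEmpty()) {
            throw new IllegalStateException(
                "Table has neither an InputFormat nor a StorageHandler");
        }
        StorageHandlerLike handler = handlerFactory.apply(table.storageHandlerClass);
        if (handler == null) {
            throw new IllegalStateException(
                "StorageHandler not found: " + table.storageHandlerClass);
        }
        // 3. Ask the handler instance for its InputFormat.
        return handler.getInputFormatClassName();
    }
}
```

Calling `InputFormatResolver.resolve(table, factory)` returns the metastore value when one is present and consults the handler only when it is missing, which mirrors the order described in the fix.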
    For Hive-HBase tables to work, the following config properties need to be added to the
    Hive storage plugin config section:
        "hbase.zookeeper.quorum": "zkhost1,zkhost2,zkhost3",
        "hbase.zookeeper.property.clientPort": "2181" // ZooKeeper port
    These properties are expected by HBaseStorageHandler to discover the HBase services.
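
As an illustration of the requirement above, the following sketch checks a map of plugin config properties for the two keys named in this message. It is an assumption-laden example: `HBaseConfigCheck` and `missingKeys` are hypothetical names, and this message does not say whether Drill performs any such check itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Drill or Hive.
final class HBaseConfigCheck {
    // The two property names listed in the pull request description.
    static final String[] REQUIRED = {
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort"
    };

    // Returns the required keys that are absent or blank in the given properties.
    static List<String> missingKeys(Map<String, String> configProps) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = configProps.get(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

An empty result from `missingKeys` means both properties are present; a non-empty result lists the keys still to be added to the plugin config.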

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vkorukanti/drill DRILL-3739

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #215

commit 60dfac886a01f53ed94ab146bedbb16fdc90427f
Author: vkorukanti <venki.korukanti@gmail.com>
Date:   2015-10-19T18:35:09Z

    DRILL-3938: Support reading from Hive tables that have schema altered after the creation
    + Remove "redoRecord" logic which is not needed after "automatic reallocation" (DRILL-1960)
    + Remove HiveTestRecordReader. Its implementation is incomplete and it is not used anywhere.
      It is currently just a burden to maintain with changes in its superclass HiveRecordReader

commit 2b28eab82f6c34bc7a27c96ddd3caf7371529f7f
Author: vkorukanti <venki.korukanti@gmail.com>
Date:   2015-10-20T23:21:09Z

    DRILL-3893: Change Hive metadata cache invalidation policy to "1 min after last write".

commit 5d35df1b085f7c4add207c3017a08a531da65dee
Author: vkorukanti <venki.korukanti@gmail.com>
Date:   2015-10-21T18:01:23Z

    DRILL-3739: Fix issues in reading Hive tables with StorageHandler configuration (eg. Hive-HBase


