hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers
Date Fri, 19 Apr 2019 00:42:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar updated HIVE-21596:
---------------------------------------
    Description: 
{{HiveMetastoreClient}} currently depends on the client and server versions being the same.
In addition, since the server APIs are backwards compatible, it is possible
for an older client (e.g. a 2.1.0 client) to connect to a newer server (e.g. a 3.1.0
server) without any issues. This is useful in setups where HMS is deployed in remote mode
and clients connect to it remotely.

It would be a good improvement if a newer {{HiveMetastoreClient}} could also connect
to an older server. When a newer client talks to an older server, the following
things can happen:

1. The client invokes an RPC which doesn't exist on the older server.
 In such a case, thrift will throw an {{Invalid method name}} exception, which should automatically
be handled by the clients since each API already throws TException.

2. The client invokes an RPC using thrift objects which have new fields added.
 When a new field is added to a thrift object, the older server does not deserialize the field in
the first place since it does not know about that field id, so wire compatibility already
exists. However, the client-side application should understand the implications of such
behavior. In such cases, it would be better for the client to throw an exception after checking
the server version (the server version check was added in HIVE-21484).

3. The newer client has re-implemented a certain API, for example using a newer thrift API.
 The client will start seeing {{Invalid method name}} exceptions since the older server does
not have such a method.
 This can be handled on the client side by making the newer implementation conditional
on the server version: the client should check the server version and invoke the new
implementation only if the server version supports the newer API. (On a side note, it would
be great if the metastore also reported which APIs are supported for a given version.)

4. The API itself is binary compatible (the method signature is the same) but it is
semantically different from previous versions. For example, in Hive-3 the {{getAllDatabases()}} API
appends the default catalog name to the request, which an older server will interpret as a pattern
for the dbName and will return nothing.

In such a case, the client should make sure that the server is at the newer version or else
fall back to the older, semantically equivalent call. In this example, it should fall back to
the old {{get_all_databases}} API call.
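
To make cases 3 and 4 concrete, here is a minimal sketch of a version-gated call with a
fallback. It is not the actual patch: {{MetastoreRpc}}, {{FakeServer}} and the method names
on them are hypothetical stand-ins for the thrift-generated stubs, and the "3.0.0" threshold
is only an assumption about where the catalog-aware behavior starts.

```java
import java.util.Arrays;
import java.util.List;

public class VersionGatedCall {

    /** Hypothetical stand-in for the metastore RPC surface; not the real thrift API. */
    public interface MetastoreRpc {
        String serverVersion();                                // e.g. "2.3.0"
        List<String> getAllDatabasesLegacy();                  // older, pre-catalog call
        List<String> getAllDatabasesInCatalog(String catalog); // newer, catalog-aware call
    }

    /** Parses the leading major.minor.patch digits, ignoring suffixes such as "-SNAPSHOT". */
    public static int[] parseVersion(String version) {
        String[] parts = version.split("[.-]");
        int[] parsed = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            try {
                parsed[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                break; // stop at the first non-numeric component
            }
        }
        return parsed;
    }

    /** Returns true if serverVersion is greater than or equal to minimum. */
    public static boolean isAtLeast(String serverVersion, String minimum) {
        int[] server = parseVersion(serverVersion);
        int[] min = parseVersion(minimum);
        for (int i = 0; i < 3; i++) {
            if (server[i] != min[i]) {
                return server[i] > min[i];
            }
        }
        return true;
    }

    /** Uses the newer call only when the server is new enough, else falls back. */
    public static List<String> getAllDatabases(MetastoreRpc rpc, String defaultCatalog) {
        // Assumption: servers from 3.0.0 onwards understand the catalog-aware request.
        if (isAtLeast(rpc.serverVersion(), "3.0.0")) {
            return rpc.getAllDatabasesInCatalog(defaultCatalog);
        }
        return rpc.getAllDatabasesLegacy();
    }

    /** Test double that records which call the client chose. */
    public static final class FakeServer implements MetastoreRpc {
        private final String version;
        public String lastCall = "none";

        public FakeServer(String version) {
            this.version = version;
        }

        @Override
        public String serverVersion() {
            return version;
        }

        @Override
        public List<String> getAllDatabasesLegacy() {
            lastCall = "legacy";
            return Arrays.asList("default", "sales");
        }

        @Override
        public List<String> getAllDatabasesInCatalog(String catalog) {
            lastCall = "catalog";
            return Arrays.asList("default", "sales");
        }
    }

    public static void main(String[] args) {
        FakeServer oldServer = new FakeServer("2.3.0");
        System.out.println(getAllDatabases(oldServer, "hive") + " via " + oldServer.lastCall);
    }
}
```

Gating on the version up front, rather than catching {{Invalid method name}} after the fact,
also covers case 4, where the older server would not throw at all and would simply return nothing.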

One real-world use case for such a feature is Impala, which wants to be able
to talk to both HMS 2.x and HMS 3.x. Other applications like Spark (or third-party applications
which want to support multiple HMS versions) may also find this useful.

Also, this patch will make a best effort to fix all such cases between Hive 2.3.0 and newer
versions of HMS. Being exhaustive should be an ongoing effort. We will also need to add
support in our test infrastructure to spin up older HMS server versions and test them
using newer client APIs. I will create a separate sub-task for that since it may need more
plumbing in ptest.

  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and server versions
are the same. Additionally, since the server APIs are backwards compatible, it is possible
for a older client (eg. 2.1.0 client version) to connect to a newer server (eg. 3.1.0 server
version) without any issues. This is useful in setups where HMS is deployed in a remote mode
and clients connect to it remotely.

It would be a good improvement if a newer version {{HiveMetastoreClient }} can connect
to the a older server version. When a newer client is talking to a older server following
things can happen:

1. Client invokes a RPC to the older server which doesn't exist.
In such a case, thrift will throw {{Invalid method name}} exception which should be automatically
be handled by the clients since each API already throws TException.

2. Client invokes a RPC using thrift objects which has new fields added.
When a new field is added to a thrift object, the server does not deserialize the field in
the first place since it does not know about that field id. So the wire-compatibility exists
already. However, the client side application should understand the implications of such a
behavior. In such cases, it would be better for the client to throw exception by checking
the server version which was added in HIVE-21484

3. If the newer client has re-implemented a certain API, for example, using newer thrift API
the client will start seeing exception {{Invalid method name}} since the older server does
not have such a method.
This can be handled on the client side by making sure that the newer implementation is conditional
to the server version. Which means client should check the server version and invoke the new
implementation only if the server version supports the newer API. (On a side note, it would
be great if metastore also gives information of which APIs are supported for a given version)

One of the real world use-case of such a feature is in Impala which wants to have capability
to talk to both HMS 2.x and HMS 3.x. But other applications like Spark (or third party applications
which want to support multiple HMS versions) may also find this useful.

Also, this patch will do a best effort to fix all such cases between Hive 2.3.0 and newer
versions of HMS. It should be a on-going effort to be exhaustive. We will also need to add
support for this in our test infrastructure to spin up older HMS server versions and test
using newer clients APIs. I will create a separate sub-task for that since it may need more
plumbing in ptest.


> HiveMetastoreClient should be able to connect to older metastore servers
> ------------------------------------------------------------------------
>
>                 Key: HIVE-21596
>                 URL: https://issues.apache.org/jira/browse/HIVE-21596
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
