calcite-dev mailing list archives

From Julian Hyde <jh...@apache.org>
Subject Re: Avatica client can not talk to Multiple Avatica Servers issue
Date Thu, 09 Aug 2018 17:28:35 GMT
One important question is: what state is shared among servers? Some pieces of state: connection
state, statement state, result set state (parameter values and position in a scroll). It is
not unusual for a cluster of servers to use a shared cache for some of the larger / slower-changing
pieces of state.

I could imagine a system where connection and statements are shared but result set state is
not. I could imagine another system where nothing is shared. The solution would be different
for those cases.
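
To make that concrete, here is a rough sketch of those pieces of state and of a
"connections and statements shared, result sets local" split (all of the class
and method names below are illustrative, not Avatica's actual internals):

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative only -- these are not Avatica's real classes.
    class ConnectionState {
      Map<String, String> properties;   // e.g. current schema, autoCommit flag
    }

    class StatementState {
      String sql;                       // the prepared SQL text
      List<Object> parameterValues;     // bound parameter values
    }

    class ResultSetState {
      long offset;                      // position in the scroll; stays on one server
    }

    // A cluster-wide cache for the larger / slower-changing pieces of state.
    // In practice this might be backed by an external key-value store rather
    // than an in-memory map.
    class SharedStateCache {
      private final Map<String, ConnectionState> connections = new ConcurrentHashMap<>();
      private final Map<String, StatementState> statements = new ConcurrentHashMap<>();

      void putConnection(String id, ConnectionState state) { connections.put(id, state); }
      ConnectionState getConnection(String id) { return connections.get(id); }
      void putStatement(String id, StatementState state) { statements.put(id, state); }
      StatementState getStatement(String id) { return statements.get(id); }
    }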

A possible technical solution might be for Avatica to transmit all necessary state in the
return from each RPC, and for the client to send that state back in the next RPC. Thus all
necessary state lives on the client and travels with every RPC call.
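
Sketched in code, that round trip might look something like this (purely
illustrative; Avatica's wire protocol does not work this way today):

    // Purely illustrative; not Avatica's actual protocol classes.
    class StatefulRequest {
      byte[] priorState;  // opaque state blob from the previous response (null on the first call)
      String operation;   // e.g. "prepare", "execute", "fetch"
    }

    class StatefulResponse {
      byte[] newState;    // complete server-side state after this RPC
      Object result;      // the actual payload (rows, update count, ...)
    }

    interface Server {
      StatefulResponse execute(StatefulRequest request);
    }

    // The client is the only durable home for state, so any server in the
    // cluster can serve any request.
    class StatelessClient {
      private byte[] state;

      Object call(Server anyServer, String operation) {
        StatefulRequest request = new StatefulRequest();
        request.priorState = state;
        request.operation = operation;
        StatefulResponse response = anyServer.execute(request);
        state = response.newState;  // carry the state forward into the next RPC
        return response.result;
      }
    }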

But such a solution would not be easy to implement, and would not perform as well as a system
that makes some reasonable assumptions about what state can be left on the server.

Julian




> On Aug 9, 2018, at 8:43 AM, Josh Elser <josh.elser@gmail.com> wrote:
> 
> Hi Jiandan,
> 
> Glad you found my write-up on this. One of the original design goals was to *not* implement
> routing logic in the client. Sticky sessions are by far the easiest way to implement this.
> 
> There is some retry logic in the Avatica client to resubmit requests when a server responds
> that it doesn't have a connection/statement cached that the client thinks it should (e.g.
> the load balancer flipped the client to a newer server). I'm still a little concerned about
> this level of "smarts" :)
> 
> I don't know if there is a fancier solution that we can do in Avatica. We could consider
> sharing state between Avatica servers, but I think it is database-dependent as to whether
> or not you could correctly reconstruct an iteration through a result set.
> 
> I had talked with a dev on the Apache Hive project. He suggested that HiveServer2 just
> fails the query when the client is mid-query and the server dies (which is reasonable -- servers
> failing should be an infrequent event).
> 
> 
> On 8/8/18 8:09 PM, JD Zheng wrote:
>> Hi,
>> Our query engine uses Calcite as its parser/optimizer, with the enumerable convention as
>> the runtime when needed, to federate different storage engines. We are trying to enable JDBC
>> access to our query engine. Everything works smoothly when we have only one Calcite/Avatica server.
>> However, JDBC calls fail if we run multiple instances of Calcite/Avatica servers behind a
>> generic load balancer. Given that a JDBC server is not stateless, this problem was not a
>> surprise. I searched around, and here are the two options suggested by the Phoenix developers
>> (https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
>> 1. sticky sessions: make the router always route a given client to the same server.
>> 2. client-driven routing: extend Avatica's protocol so the client passes an identifier that
>> the load balancer uses to route each request to the right backend server.
>> Before we rush into any implementation, we would really appreciate it if anyone could share
>> experience or thoughts regarding this issue. Thanks,
>> -Jiandan

