drill-user mailing list archives

From Paul Rogers <par0...@yahoo.com.INVALID>
Subject Re: Apache Drill High Availability using HAproxy
Date Mon, 20 Aug 2018 18:01:51 GMT
Hi Satish,

You did not say whether you are using HAProxy for the RESTful API or for the native Drill RPC
(as used by the Drill client, JDBC, and ODBC).

To understand the use of proxies and load balancers, it is helpful to remember that Drill
is a stateful SQL engine. Drill encourages the use of many stateful commands such as USE,

Session state is lost when connecting to a new Drillbit, or reconnecting to the same Drillbit.
Thus, a query that runs fine before the reconnect can fail afterwards.
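To make the failure mode concrete, here is a minimal toy model in Python. It is only a sketch: `ToyDrillbit` and `Session` are hypothetical names invented for illustration, not Drill APIs, and the "query" is a stand-in that simply needs a default schema to resolve its table.

```python
# Toy model only: these classes are hypothetical and are not Drill APIs.

class Session:
    """Per-connection state, held in memory for one physical connection."""
    def __init__(self):
        self.default_schema = None


class ToyDrillbit:
    def connect(self):
        # Every physical connection starts with a fresh, empty session.
        return Session()

    def run(self, session, sql):
        sql = sql.strip().rstrip(";")
        if sql.upper().startswith("USE "):
            # Stateful command: remember the default schema in the session.
            session.default_schema = sql[4:].strip()
            return "ok"
        # Anything else is treated as a query on an unqualified table name.
        if session.default_schema is None:
            raise RuntimeError("no default schema set; table cannot be resolved")
        return f"rows from {session.default_schema}"


bit = ToyDrillbit()

s1 = bit.connect()
bit.run(s1, "USE dfs.tmp")
before = bit.run(s1, "SELECT * FROM t")  # succeeds: schema is in the session

s2 = bit.connect()  # reconnect, even to the same Drillbit
try:
    bit.run(s2, "SELECT * FROM t")
    after = "ok"
except RuntimeError as exc:
    after = f"failed: {exc}"  # the same query now fails
```

The same query text succeeds on the first connection and fails on the second, because the effect of the earlier USE lived only in the first connection's session.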

This issue is not unique to Drill; it is a common constraint of all old-school SQL engines.

If state were not an issue, then the Drill client itself could handle HA. The client is given
a list of ZK nodes. The client, on encountering a disconnect, could ask ZK for a new node
and reconnect. Since ZK is HA, the client can also recover from a ZK node failure by trying

We have discussed this client-based HA approach multiple times, but each time the SQL session
state has been a show-stopper.

In short, the issue is not whether to use HAProxy to solve the problem; Drill could handle
reconnection internally in the client. The issue is how to handle session state.
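The client-side reconnect idea can be sketched roughly as follows. This is a toy simulation, not Drill client code: `ToyRegistry` stands in for ZooKeeper, `ToyClient` for the Drill client, and the node names are made up. It shows that finding a new node is the easy part, while the session comes back empty.

```python
import random

# Toy simulation only: ToyRegistry and ToyClient are hypothetical names.

class ToyRegistry:
    """Stands in for ZooKeeper: tracks which Drillbits are currently alive."""
    def __init__(self, drillbits):
        self.alive = set(drillbits)

    def live_drillbits(self):
        return sorted(self.alive)


class ToyClient:
    def __init__(self, registry, seed=0):
        self.registry = registry
        self.rng = random.Random(seed)
        self.node = None
        self.session = {}

    def connect(self):
        candidates = self.registry.live_drillbits()
        if not candidates:
            raise ConnectionError("no Drillbits registered")
        self.node = self.rng.choice(candidates)
        self.session = {}  # new physical connection: session state starts empty

    def use(self, schema):
        self._ensure_connected()
        self.session["default_schema"] = schema

    def _ensure_connected(self):
        # On a lost node, ask the registry for a live one and reconnect.
        if self.node not in self.registry.alive:
            self.connect()


registry = ToyRegistry(["bit-a", "bit-b"])
client = ToyClient(registry)
client.connect()
client.use("dfs.tmp")
first_node = client.node

registry.alive.discard(first_node)  # simulate the Drillbit going down
client._ensure_connected()          # client fails over by itself
```

After the failover, `client.node` is the surviving Drillbit, but `client.session` is empty again: the earlier USE has been forgotten, which is exactly the show-stopper described above.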

A possible solution would be to store user session state in ZK so that we could re-establish
the same logical session after a physical reconnection. In particular, a unique session ID
could be used to key connections to session state in ZK.
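As a rough illustration of that idea, the sketch below keys session state by a session ID in an external store. Everything here is an assumption made for illustration: `ToySessionStore` is an in-memory dictionary standing in for ZK, and `ToyConnection` is a hypothetical connection object. It is not how Drill works today.

```python
import uuid

# Sketch under stated assumptions: an in-memory dict stands in for ZK.

class ToySessionStore:
    """Maps session ID -> session state, surviving physical reconnects."""
    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        self._data[session_id] = dict(state)

    def load(self, session_id):
        # Unknown IDs yield an empty state, like a brand-new session.
        return dict(self._data.get(session_id, {}))


class ToyConnection:
    """One physical connection bound to a logical session ID."""
    def __init__(self, store, session_id=None):
        self.store = store
        self.session_id = session_id or str(uuid.uuid4())
        # Re-establish the logical session from the shared store.
        self.state = store.load(self.session_id)

    def use(self, schema):
        self.state["default_schema"] = schema
        # Write through so another Drillbit could pick the session up.
        self.store.save(self.session_id, self.state)


store = ToySessionStore()

first = ToyConnection(store)
first.use("dfs.tmp")

# Physical reconnect that presents the same logical session ID.
second = ToyConnection(store, session_id=first.session_id)
restored = second.state

# A connection with a new ID still starts from an empty session.
fresh = ToyConnection(store).state
```

The key design point is that the session ID, not the physical connection, identifies the session, so a reconnect to any Drillbit could restore the same state.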

Making this change would be a good contributor project: it involves detailed knowledge of
how the Drill session and ZK state work, but is pretty isolated to just those specific areas. 
- Paul


    On Monday, August 20, 2018, 8:26:09 AM PDT, drill <ganesh.satish34@gmail.com> wrote:
 Hi Team,

Good evening. I am Satish, working as a big data developer. I need your help regarding Drill
high availability using the HAProxy load balancer.
Does Apache Drill support high availability? If yes, please let me know the process.

