drill-user mailing list archives

From Paul Rogers <prog...@mapr.com>
Subject Re: Drill Session ID between Nodes
Date Fri, 23 Jun 2017 22:57:45 GMT
I stand corrected on one point (Thanks, Sorabh!): the Drill web server does have a session
timeout, configurable in boot options, that defaults to one hour.

- Paul

> On Jun 23, 2017, at 2:10 PM, Paul Rogers <progers@mapr.com> wrote:
> 
> Hi John,
> 
> Your use case is interesting. I’m certainly not an expert in the network aspects of
what you are trying to do, but I can take a shot at the related Drill issues.
> 
> Drill’s primary use case is connecting via the Drill client (typically via JDBC or
ODBC.) The Drill client handles security. It also allows SQL sessions, and hence session options.
> 
> Your use case is based on the REST API. At present, the REST API is best described as
a “prototype.” REST supports username/password login, and sessions associated with the
login (on a single Drillbit). Sessions never time out (as far as I can tell.) More importantly,
the REST API returns all query results in a single message, encoded as JSON. This is great
for small queries, but does not scale well when returning millions of large rows. (Hint: we
are looking for contributions to improve the REST API!)
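
To make the "single JSON message" point concrete, here is a minimal sketch of a REST query from Python. It assumes a Drillbit web server on the default port 8047 with authentication disabled; the helper names `build_query_request` and `run_query` are illustrative and not part of Drill.

```python
import json
import urllib.request


def build_query_request(base_url, sql):
    """Build (but do not send) a POST to the Drill REST query endpoint."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url, sql):
    """Send the query; the whole result set arrives as one JSON document."""
    with urllib.request.urlopen(build_query_request(base_url, sql)) as resp:
        # Every row is materialized in memory here, which is why this
        # approach suits small results only.
        return json.load(resp)["rows"]
```

Because `run_query` reads the whole response before returning, memory use grows with the size of the result set.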
> 
> As Keys pointed out, the important question is this: does your app need session state
other than security? If so, then you need to consider overall SQL session state, not just
SSL connections. If your script does “ALTER SESSION” followed by a query, then the ALTER
SESSION might be sent to node A, with the query going to node B. Node B does not know about
the session on A, and so results will be different than what you expect. The same is true
with temp tables.
> 
> Said another way, you’d like your scripts to do round-robin per *request*, but Drill
is designed to do round-robin *per session.* (The Drill client, when using ZK, does random
selection of nodes that achieves roughly the same result.) In short, your use case is clear,
but is not supported today in Drill.
> 
> Putting this together:
> 
> 1. Sessions must be sticky to a single Drillbit so that session state, temp tables and
so on are persisted (on that one Drillbit.)
> 2. If a session on one Drillbit drops, the app must establish a new session on another
Drillbit. That involves not just security tokens and cookies, but also resetting session options,
rebuilding temp tables, etc.
> 3. Since the app has to handle session recreation when switching Drillbits, the security
issue, while a nuisance, is a necessary result of switching sessions.
> 4. (As Keys points out,) changing sessions is a rare event (due to timeouts, node failures,
etc.) so session recovery should be rare.
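
The four points above can be sketched from the application side as follows. This is only an illustration under stated assumptions: `StickySessionClient` is a hypothetical name, and `connect` is a hypothetical callable supplied by the app that logs in to one Drillbit and returns an object with an `execute(sql)` method. Nothing here is an existing Drill API.

```python
class StickySessionClient:
    """Pin all statements to one Drillbit; rebuild session state on failover."""

    def __init__(self, drillbits, connect):
        self._drillbits = list(drillbits)
        self._connect = connect   # hypothetical: host -> logged-in session
        self._setup = []          # state-changing statements to replay
        self._index = 0
        self._session = None

    def _ensure_session(self):
        if self._session is None:
            host = self._drillbits[self._index % len(self._drillbits)]
            session = self._connect(host)  # new login, new cookies/tokens
            for stmt in self._setup:       # reset options, rebuild temp tables
                session.execute(stmt)
            self._session = session
        return self._session

    def setup(self, sql):
        """Run a state-changing statement and remember it for replay."""
        result = self.execute(sql)
        self._setup.append(sql)
        return result

    def execute(self, sql):
        """Run on the pinned Drillbit; move to the next one only on failure."""
        last_error = None
        for _ in range(len(self._drillbits)):
            try:
                return self._ensure_session().execute(sql)
            except ConnectionError as exc:
                last_error = exc
                self._session = None   # drop the dead session
                self._index += 1       # try the next Drillbit
        raise ConnectionError("no Drillbit reachable") from last_error
```

Statements such as ALTER SESSION would go through `setup`, so they are replayed after a failover; ordinary queries go through `execute`. The login cost is paid only when the session actually moves, which matches point 4.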
> 
> The only way to make sessions “portable” is to create a global session shared
across Drillbits, which is what you are proposing. Doing so is non-trivial: it requires a
global session registry (or a way of synchronizing session state). Such sharing is not supported
in Drill’s distributed, shared-nothing architecture. Could we add it? Probably, but not
in the short term. If we ever find the need for a “metastore” (or central work scheduler),
then at that time Drill would have a mechanism to support session portability; but that is
a ways off.
> 
> For the short term, can you perhaps rethink the use case given that sessions are local?
How will your app handle failover? Is the security issue as much of a problem when seen as
part of session recreation? (I’m not an expert here; I’m asking how this might work: are
there things, short of persistent sessions, we can do to help?)
> 
> You mentioned Drill-on-YARN (DoY). DoY is an interesting question. On the surface, REST
works the same on DoY as in “regular” Drill: the REST endpoint doesn’t care how the
Drillbit was launched. Whatever works with regular Drill will work with DoY. Under DoY, J/ODBC
clients work as usual: they maintain a session with one Drillbit, and use ZK to find a fall-back
Drillbit if the first one fails (with the need for the client to re-establish the SQL session
state by resending session options, etc.) Can we improve this? Yes, if we did the work described
earlier.
> 
> (BTW: I’m still looking for volunteers to help with code reviews so we can contribute
DoY to Apache Drill…)
> 
> We have not yet looked into the security setup for DoY. (We wanted to get the security
fully working with Drill itself first.) You raise some good issues that we must wrestle with
as we enhance DoY to use the security features that are becoming available in Drill itself.
> 
> Thanks,
> 
> - Paul
> 
> 
>> On Jun 23, 2017, at 9:50 AM, John Omernik <john@omernik.com> wrote:
>> 
>> That makes sense, ya, I would love to hear about the challenges of this in
>> general from the Drill folks.
>> 
>> Also, I wonder if Paul R at MapR has any thoughts in how something like
>> this would be handled in the Drill on Yarn Setup.
>> 
>> 
>> John
>> 
> 
