hive-issues mailing list archives

From "Peter Vary (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.
Date Fri, 26 Oct 2018 12:40:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665119#comment-16665119
] 

Peter Vary commented on HIVE-20682:
-----------------------------------

[~sankarh] [~maheshk114]: What happens when the following sequence of events occurs:
 * Session started - Hive object H1 is created with allowClose=false
 * Async compilation of a long compiling query Q1 is started (BackGroundWork is started) -
the other thread gets H1 as its ThreadLocal Hive object too.
 * Set some metastore client configuration (set hive.metastore.client.socket.timeout=7000)
 * New query Q2 is issued - Hive.get(HiveConf) will detect that the 2 configurations are not
compatible
 ** Which triggers the creation of a new Hive object, H2
 *** The allowClose for H2 will be true
 *** This will be set as a ThreadLocal for the given thread - but not to the session level
(I think this is a problem since from now on every thread needs to create its own Hive object
- but this is a different problem :))
 ** Also H1 is not closed since allowClose=false
 ** If I have not made any mistakes up to this stage, then "assert (!parentHive.allowClose());"
will fail too, so the query will not run.
 * Async compilation of Q1 is finished
 ** closeCurrent() is called - since allowClose is false the HMS connection is not closed
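
To make the sequence above easier to check, here is a toy model of it. This is only a sketch under stated assumptions, not Hive's actual code: ToyHive, get(int), closeCurrent(), the connection counter and the initial timeout value 600 are all hypothetical stand-ins, and the compatibility check is reduced to comparing a single timeout value.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class AllowCloseSketch {
    // Number of simulated metastore connections that are currently open.
    static final AtomicInteger openConnections = new AtomicInteger();

    /** Hypothetical stand-in for a Hive object; not the real Hive class. */
    static final class ToyHive {
        final int socketTimeout;   // stands in for the metastore client configuration
        final boolean allowClose;
        volatile boolean open = true;

        ToyHive(int socketTimeout, boolean allowClose) {
            this.socketTimeout = socketTimeout;
            this.allowClose = allowClose;
            openConnections.incrementAndGet();
        }

        /** Closes the simulated connection only if allowClose is set or force is true. */
        void close(boolean force) {
            if (open && (allowClose || force)) {
                open = false;
                openConnections.decrementAndGet();
            }
        }
    }

    static final ThreadLocal<ToyHive> threadLocalHive = new ThreadLocal<>();

    /** Simplified get(): re-creates the thread-local object when the configuration differs. */
    static ToyHive get(int socketTimeout) {
        ToyHive db = threadLocalHive.get();
        if (db == null || db.socketTimeout != socketTimeout) {
            if (db != null) {
                db.close(false);                       // no-op when allowClose is false
            }
            db = new ToyHive(socketTimeout, true);     // thread-created objects may be closed
            threadLocalHive.set(db);
        }
        return db;
    }

    /** Simplified closeCurrent(): non-forced close, then drop the thread-local reference. */
    static void closeCurrent() {
        ToyHive db = threadLocalHive.get();
        if (db != null) {
            db.close(false);
        }
        threadLocalHive.remove();
    }

    /** Replays the sequence and returns the open-connection count after each step. */
    public static int[] runScenario() {
        openConnections.set(0);
        ToyHive h1 = new ToyHive(600, false);          // session-level H1, allowClose=false
        threadLocalHive.set(h1);

        CountDownLatch q2Issued = new CountDownLatch(1);
        Thread async = new Thread(() -> {
            threadLocalHive.set(h1);                   // async thread shares H1 for Q1
            try {
                q2Issued.await();                      // Q1 is still compiling
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            closeCurrent();                            // Q1 finished; H1 stays open
        });
        async.start();

        get(7000);                                     // Q2 after the config change creates H2
        int afterQ2 = openConnections.get();
        q2Issued.countDown();
        try {
            async.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        int afterQ1 = openConnections.get();
        closeCurrent();                                // master thread closes H2
        int afterMasterClose = openConnections.get();
        h1.close(true);                                // session close forces H1 closed
        int afterSessionClose = openConnections.get();
        return new int[] {afterQ2, afterQ1, afterMasterClose, afterSessionClose};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runScenario()));
    }
}
```

In this model two connections are open while Q1 and Q2 overlap, H1 survives the async closeCurrent(), and only the forced close at session end releases it. The model does not cover the assert mentioned above.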

In effect, the session's HMS connection is closed only when the session itself is closed,
but additional connections are created whenever the following condition is true:
{code:java}
if (db == null || !db.isCurrentUserOwner() || needsRefresh
    || (c != null && !isCompatible(db, c, isFastCheck))) {
{code}
What do you think?

Is there any mistake in my reasoning?

Thanks,

Peter

> Async query execution can potentially fail if shared sessionHive is closed by master
thread.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20682
>                 URL: https://issues.apache.org/jira/browse/HIVE-20682
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, HIVE-20682.03.patch, HIVE-20682.04.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the *HiveSessionImpl* class when we
open a new session for a client connection, and by default all queries from this connection share
the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the sessionHive object
(referred via thread local hiveDb) if  {{Hive.isCompatible}} returns false and sets new Hive
object in thread local HiveDb but doesn't change the sessionHive object in the session. Whereas,
*asynchronous* query execution via async threads never closes the sessionHive object and
it just creates a new one if needed and sets it as their thread local hiveDb.
> So, the problem can happen in the case where an *asynchronous* query being executed
by async threads refers to the sessionHive object and the master thread receives a *synchronous*
query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object to sessionHive object
which potentially leaks a metastore connection if the previous synchronous query execution
re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared by multiple threads and so it shouldn't be allowed
to be closed by any query execution threads when they re-create the Hive object due to changes
in Hive configurations. But the Hive objects created by query execution threads should be
closed when the thread exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in Hive object which
should be set to *false* for *sessionHive* and would be forcefully closed when the session
is closed or released.
> cc [~pvary]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
