hive-issues mailing list archives

From "Aihua Xu (JIRA)" <>
Subject [jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
Date Wed, 30 Mar 2016 21:02:25 GMT


Aihua Xu commented on HIVE-13149:

[~szehon], [], [~ashutoshc] Can you guys take another look at the latest patch to
see if it makes sense? 

The change includes: 
1. We no longer acquire an HMS connection upfront for new threads, since in most cases (e.g.
MR tasks) it is never used. Connections are acquired only when needed, e.g. for StatsTask.
2. We keep a copy of HiveConf in HMSClient, rather than letting the caller decide whether to
copy HiveConf before calling get(conf).getMSC(). Callers typically make HMS config changes to
conf but forget to copy it first, so the getMSC() call would still return the old connection
and the config changes would not propagate to HMS.
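
To illustrate the two points above, here is a minimal, self-contained sketch. It is not the
patch itself and does not use any Hive API: LazyConnectionSketch, FakeClient, ClientHolder,
client() and the "metastore.uri" key are all hypothetical names, and a plain Map stands in
for HiveConf. It only shows the general idea, assuming the holder compares the caller's
settings against its own private copy: no connection is opened until a thread asks for one,
and an in-place change to the caller's config is noticed because the snapshot is a separate
object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these names are Hive APIs. */
public class LazyConnectionSketch {

    /** Counts how many stand-in "connections" have been opened. */
    static final AtomicInteger OPENED = new AtomicInteger();

    /** Stand-in for a metastore client; creating one represents opening a connection. */
    static final class FakeClient {
        final Map<String, String> conf;

        FakeClient(Map<String, String> conf) {
            this.conf = conf;
            OPENED.incrementAndGet();
        }
    }

    /** Per-thread holder: connects lazily and keeps a private copy of the config. */
    static final class ClientHolder {
        private Map<String, String> confSnapshot; // our own copy, never the caller's map
        private FakeClient client;                // stays null until first requested

        FakeClient get(Map<String, String> callerConf) {
            // (Re)connect if we never connected, or if the caller's settings now differ
            // from the snapshot taken when the current client was created.
            if (client == null || !callerConf.equals(confSnapshot)) {
                confSnapshot = new HashMap<>(callerConf);
                client = new FakeClient(confSnapshot);
            }
            return client;
        }
    }

    /** One holder per thread, created without opening any connection. */
    private static final ThreadLocal<ClientHolder> HOLDER =
            ThreadLocal.withInitial(ClientHolder::new);

    /** Returns this thread's client, connecting only on demand. */
    static FakeClient client(Map<String, String> conf) {
        return HOLDER.get().get(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("metastore.uri", "thrift://a"); // hypothetical key and value

        // A thread that never calls client() opens nothing.
        System.out.println("opened before first use: " + OPENED.get());

        FakeClient first = client(conf);
        FakeClient again = client(conf); // unchanged config, same client is reused
        System.out.println("reused: " + (first == again));

        conf.put("metastore.uri", "thrift://b"); // caller mutates conf in place
        FakeClient afterChange = client(conf);   // snapshot differs, so we reconnect
        System.out.println("reconnected: " + (afterChange != first));
        System.out.println("opened in total: " + OPENED.get());
    }
}
```

If the holder stored the caller's map itself instead of a copy, the in-place put() would
change both sides of the comparison at once and the stale client would be returned, which is
the failure mode described in point 2.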

The change should reduce connection overhead and possible connection leaks.

Our tests seem unstable, but the failures don't look related after several reruns.

> Remove some unnecessary HMS connections from HS2 
> -------------------------------------------------
>                 Key: HIVE-13149
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, HIVE-13149.3.patch, HIVE-13149.4.patch,
HIVE-13149.5.patch, HIVE-13149.6.patch
> In the SessionState class, we currently always try to get an HMS connection in
> {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, regardless of whether
> the connection will be used later.
> When SessionState is accessed by the tasks in, a new HMS connection is currently
> established for each Task thread, although most tasks, other than a few such as StatsTask,
> don't need to access HMS. If HiveServer2 is configured to run in parallel and the query
> involves many tasks, then many connections are created but never used.
> {noformat}
>   @Override
>   public void run() {
>     runner = Thread.currentThread();
>     try {
>       OperationLog.setCurrentOperationLog(operationLog);
>       SessionState.start(ss);
>       runSequential();
> {noformat}

