hive-issues mailing list archives

From "Nemon Lou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12538) After set spark related config, SparkSession never get reused
Date Sat, 28 Nov 2015 06:28:10 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030405#comment-15030405 ]

Nemon Lou commented on HIVE-12538:
----------------------------------

After debugging, I found that the operation conf object that SparkUtilities uses to detect
configuration changes is a different object from the session conf.
SparkUtilities resets the "updated" flag only on the operation conf, so the session conf
object's getSparkConfigUpdated method always returns true after a Spark-related config is set.
The code path where SQLOperation copies a new conf object from the session conf:
https://github.com/apache/hive/blob/spark/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L467
{code}
/**
   * If there are query specific settings to overlay, then create a copy of config
   * There are two cases we need to clone the session config that's being passed to hive driver
   * 1. Async query -
   *    If the client changes a config setting, that shouldn't reflect in the execution already underway
   * 2. confOverlay -
   *    The query specific settings should only be applied to the query config and not session
   * @return new configuration
   * @throws HiveSQLException
   */
  private HiveConf getConfigForOperation() throws HiveSQLException {
    HiveConf sqlOperationConf = getParentSession().getHiveConf();
    if (!getConfOverlay().isEmpty() || shouldRunAsync()) {
      // clone the partent session config for this query
      sqlOperationConf = new HiveConf(sqlOperationConf);

      // apply overlay query specific settings, if any
      for (Map.Entry<String, String> confEntry : getConfOverlay().entrySet()) {
        try {
          sqlOperationConf.verifyAndSet(confEntry.getKey(), confEntry.getValue());
        } catch (IllegalArgumentException e) {
          throw new HiveSQLException("Error applying statement specific settings", e);
        }
      }
    }
    return sqlOperationConf;
  }
{code}
The code path where SparkUtilities detects the change and closes the Spark session:
https://github.com/apache/hive/blob/spark/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java#L122
{code}
public static SparkSession getSparkSession(HiveConf conf,
      SparkSessionManager sparkSessionManager) throws HiveException {
    SparkSession sparkSession = SessionState.get().getSparkSession();

    // Spark configurations are updated close the existing session
    if (conf.getSparkConfigUpdated()) {
      sparkSessionManager.closeSession(sparkSession);
      sparkSession =  null;
      conf.setSparkConfigUpdated(false);
    }
    sparkSession = sparkSessionManager.getSession(sparkSession, conf, true);
    SessionState.get().setSparkSession(sparkSession);
    return sparkSession;
  }
{code}
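
The interaction between the two code paths above can be reproduced in isolation. The following is a minimal sketch, not Hive code: `Conf` is a hypothetical stand-in for HiveConf, and it assumes that the copy constructor carries the "updated" flag over to the copy and that setting any `spark.`-prefixed key raises the flag. `runQueries` is likewise a hypothetical helper that mimics the check-and-reset logic of getSparkSession.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionReuseSketch {

    /** Hypothetical stand-in for HiveConf: properties plus the "Spark config updated" flag. */
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        private boolean sparkConfigUpdated;

        Conf() {}

        /** Copy constructor; assumed to copy the flag along with the properties. */
        Conf(Conf other) {
            props.putAll(other.props);
            sparkConfigUpdated = other.sparkConfigUpdated;
        }

        /** Assumption: any key starting with "spark." marks the Spark config as updated. */
        void set(String key, String value) {
            props.put(key, value);
            if (key.startsWith("spark.")) {
                sparkConfigUpdated = true;
            }
        }

        boolean getSparkConfigUpdated() { return sparkConfigUpdated; }

        void setSparkConfigUpdated(boolean updated) { sparkConfigUpdated = updated; }
    }

    /**
     * Runs the given number of queries against one session conf and returns how many
     * Spark sessions were created. When cloneConf is true, each query works on a copy
     * of the session conf, as happens for async queries or queries with a conf overlay.
     */
    static int runQueries(int queries, boolean cloneConf) {
        Conf sessionConf = new Conf();
        sessionConf.set("spark.yarn.queue", "QueueA");

        Object sparkSession = null; // placeholder for the cached Spark session
        int sessionsCreated = 0;

        for (int i = 0; i < queries; i++) {
            Conf operationConf = cloneConf ? new Conf(sessionConf) : sessionConf;

            // Same shape as getSparkSession: close on update, then reset the flag.
            if (operationConf.getSparkConfigUpdated()) {
                sparkSession = null;                        // "close" the existing session
                operationConf.setSparkConfigUpdated(false); // resets only this object
            }
            if (sparkSession == null) {
                sparkSession = new Object();                // "open" a new session
                sessionsCreated++;
            }
        }
        return sessionsCreated;
    }

    public static void main(String[] args) {
        System.out.println("cloned conf: " + runQueries(3, true) + " sessions");  // cloned conf: 3 sessions
        System.out.println("shared conf: " + runQueries(3, false) + " sessions"); // shared conf: 1 sessions
    }
}
```

With a cloned conf the reset never reaches the session conf, so every query sees the flag as true and a new session is opened each time, which matches the three YARN applications in the report. With a shared conf the flag is cleared after the first query and the session is reused.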

It should be easy to reproduce. I will dig further.



> After set spark related config, SparkSession never get reused
> -------------------------------------------------------------
>
>                 Key: HIVE-12538
>                 URL: https://issues.apache.org/jira/browse/HIVE-12538
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.3.0
>            Reporter: Nemon Lou
>
> Hive on Spark yarn-cluster mode.
> After setting "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3 different YARN applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
