hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <>
Subject [jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive
Date Thu, 01 Dec 2016 05:13:58 GMT


Xuefu Zhang commented on HIVE-15302:

I think there are two dependencies on Spark from Hive:

  1. The Spark runtime classes, which used to be in spark-assembly.jar.
  2. The script that is used to submit a Spark application for a Hive session.

For #1, I think spark.yarn.jars or spark.yarn.archive will do.
For #2, I think we still need SPARK_HOME, unless we clone a simplified Spark installation
into the Hive directory structure, which is not ideal.
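To make #2 concrete, here is a purely hypothetical sketch (the fake SPARK_HOME path and the stub script are invented for illustration, not Hive's actual code): Hive would resolve the submit script relative to SPARK_HOME rather than shipping its own copy:

```shell
# Stand-in Spark installation (illustrative path, not a real install):
SPARK_HOME=/tmp/fake-spark
mkdir -p "$SPARK_HOME/bin"
# Stub for the spark-submit script that Hive invokes per session:
printf '#!/bin/sh\necho submitted\n' > "$SPARK_HOME/bin/spark-submit"
chmod +x "$SPARK_HOME/bin/spark-submit"
# Hive-side resolution: locate the script via SPARK_HOME and run it.
"$SPARK_HOME/bin/spark-submit"
```

Running the last line prints "submitted", which stands in for handing the application off to the cluster.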

Thus, SPARK_HOME still seems required. If so, Hive can automatically figure out spark.yarn.jars
or spark.yarn.archive from SPARK_HOME if it's not already set. To speed up file distribution,
an admin can point either of these properties to an HDFS location, which requires the admin
to manually upload the files to HDFS beforehand.
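The one-time admin workflow described above might look like this (the HDFS path and property value are illustrative assumptions, not tested settings):

```
# One-time setup by an admin (illustrative HDFS path):
hdfs dfs -mkdir -p /user/spark/jars
hdfs dfs -put "$SPARK_HOME"/jars/*.jar /user/spark/jars/

# Then point the property at that location, e.g. in spark-defaults.conf:
#   spark.yarn.jars  hdfs:///user/spark/jars/*.jar
```

With the jars already on HDFS, YARN localizes them from there instead of uploading them from the client on every session.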

As to spark.yarn.archive, I think one needs to zip all the jars, not the folder that contains
the jars. However, I haven't tried to verify this.
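A hypothetical sketch of that zip layout (the paths are illustrative, and placeholder files stand in for a real Spark install): the jars should sit at the archive root, with no enclosing jars/ directory.

```shell
# Simulate $SPARK_HOME/jars with placeholder files for demonstration:
mkdir -p /tmp/spark-home/jars
touch /tmp/spark-home/jars/spark-core.jar /tmp/spark-home/jars/spark-sql.jar
(
  cd /tmp/spark-home/jars
  # Zip the jar files directly so they land at the archive root,
  # rather than zipping the jars/ folder itself:
  zip -q /tmp/spark-archive.zip *.jar
)
# Inspect the result; entries should be bare jar names:
unzip -l /tmp/spark-archive.zip
```

The listing shows entries like spark-core.jar with no directory prefix, which is the layout the comment above suggests spark.yarn.archive expects.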

> Relax the requirement that HoS needs Spark built w/o Hive
> ---------------------------------------------------------
>                 Key: HIVE-15302
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rui Li
>            Assignee: Rui Li
> This requirement becomes more and more unacceptable as SparkSQL becomes widely adopted.
Let's use this JIRA to find out how we can relax the limitation.

This message was sent by Atlassian JIRA
