spark-user mailing list archives

From Felix C <felixcheun...@hotmail.com>
Subject Re: Executing hive query from Spark code
Date Tue, 03 Mar 2015 04:59:09 GMT
It should work in CDH without having to recompile.

http://eradiating.wordpress.com/2015/02/22/getting-hivecontext-to-work-in-cdh/
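For reference, the usage pattern the linked post describes looks roughly like this on a Hive-enabled build (a minimal sketch; the app name, table name, and query are placeholders, and it assumes hive-site.xml is on the classpath, e.g. from /etc/hive/conf on a CDH node):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Requires the spark-hive module on the classpath, i.e. an assembly
// built with -Phive (or a distribution that ships one, such as CDH).
val sc = new SparkContext(new SparkConf().setAppName("HiveQueryExample"))
val hiveContext = new HiveContext(sc)

// The result of sql() behaves like any other RDD (SchemaRDD in
// Spark 1.2, DataFrame in 1.3) and can be transformed further.
val rows = hiveContext.sql("SELECT key, value FROM some_table LIMIT 10")
rows.collect().foreach(println)
```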

--- Original Message ---

From: "Ted Yu" <yuzhihong@gmail.com>
Sent: March 2, 2015 1:35 PM
To: "nitinkak001" <nitinkak001@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Executing hive query from Spark code

Here is snippet of dependency tree for spark-hive module:

[INFO] org.apache.spark:spark-hive_2.10:jar:1.3.0-SNAPSHOT
...
[INFO] +- org.spark-project.hive:hive-metastore:jar:0.13.1a:compile
[INFO] |  +- org.spark-project.hive:hive-shims:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-common:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-0.20:jar:0.13.1a:runtime
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-common-secure:jar:0.13.1a:compile
[INFO] |  |  +- org.spark-project.hive.shims:hive-shims-0.20S:jar:0.13.1a:runtime
[INFO] |  |  \- org.spark-project.hive.shims:hive-shims-0.23:jar:0.13.1a:runtime
...
[INFO] +- org.spark-project.hive:hive-exec:jar:0.13.1a:compile
[INFO] |  +- org.spark-project.hive:hive-ant:jar:0.13.1a:compile
[INFO] |  |  \- org.apache.velocity:velocity:jar:1.5:compile
[INFO] |  |     \- oro:oro:jar:2.0.8:compile
[INFO] |  +- org.spark-project.hive:hive-common:jar:0.13.1a:compile
...
[INFO] +- org.spark-project.hive:hive-serde:jar:0.13.1a:compile

bq. is there a way to have the hive support without updating the assembly

I don't think so.
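For completeness, the build the documentation refers to is roughly the following, run from the Spark source root (a sketch; the exact set of profiles varies with the Spark version and the target Hadoop distribution):

```shell
# -Phive and -Phive-thriftserver are Maven profiles: they add the
# spark-hive module (and its org.spark-project.hive dependencies,
# shown in the tree above) into the assembly jar.
mvn -Phive -Phive-thriftserver -DskipTests clean package
```

The resulting assembly jar must then be distributed to the worker nodes, which is why there is no way to get Hive support without updating the assembly.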

On Mon, Mar 2, 2015 at 12:37 PM, nitinkak001 <nitinkak001@gmail.com> wrote:

> I want to run a Hive query inside Spark and use the RDDs generated from it
> in Spark. I read in the documentation:
>
> "Hive support is enabled by adding the -Phive and -Phive-thriftserver flags
> to Spark's build. This command builds a new assembly jar that includes
> Hive. Note that this Hive assembly jar must also be present on all of the
> worker nodes, as they will need access to the Hive serialization and
> deserialization libraries (SerDes) in order to access data stored in
> Hive."
>
> I just wanted to know what the -Phive and -Phive-thriftserver flags really
> do, and whether there is a way to have Hive support without updating the
> assembly. Does the flag add a Hive support jar or something?
>
> The reason I am asking is that I will be using Cloudera version of Spark in
> future and I am not sure how to add the Hive support to that Spark
> distribution.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Executing-hive-query-from-Spark-code-tp21880.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>
