spark-user mailing list archives

From ReeceRobinson <>
Subject Do I really need to build Spark for Hive/Thrift Server support?
Date Mon, 27 Jul 2015 22:56:28 GMT
I'm a bit confused by the documentation around Hive support.

I want to use a remote Hive metastore/HDFS server, and the documentation says
that Spark must be built from source because of the large number of
dependencies Hive requires.
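For what it's worth, pointing Spark SQL at a remote metastore is usually a
matter of placing a hive-site.xml on Spark's classpath (e.g. in conf/), rather
than anything in the build itself. A sketch, with the host and port as
placeholders:

```xml
<!-- conf/hive-site.xml: placeholder host/port, adjust for your metastore -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```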

Specifically, the documentation says:

"Hive has a large number of dependencies, it is not included in the default
Spark assembly. ... This command builds a new assembly jar that includes ..."

So I downloaded the source distribution of Spark 1.4.1 and executed the
following build command:

./ --name spark-1.4.1-hadoop-2.6-hive --tgz -Pyarn
-Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver  -DskipTests

Inspecting the size of the resulting spark-assembly-1.4.1-hadoop2.6.0.jar, it
is only a few bytes different: the pre-built jar is 162,976,273 bytes and my
custom-built jar is 162,976,444 bytes. I don't see any new Hive jar file either?
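Comparing file sizes is not a reliable check; a more direct one is to scan the
assembly jar for Hive classes. A minimal sketch (it builds a tiny stand-in jar
for the demo, since a jar is just a zip archive; point jar_path at the real
spark-assembly-1.4.1-hadoop2.6.0.jar instead):

```python
import zipfile

# Hypothetical path for the demo; substitute your real assembly jar.
jar_path = "assembly-demo.jar"

# Build a small stand-in jar so the sketch is self-contained.
with zipfile.ZipFile(jar_path, "w") as jar:
    jar.writestr("org/apache/spark/sql/hive/HiveContext.class", b"")
    jar.writestr("org/apache/spark/SparkContext.class", b"")

def has_hive_classes(path):
    """Return True if any entry under Spark SQL's Hive package is present."""
    with zipfile.ZipFile(path) as jar:
        return any(name.startswith("org/apache/spark/sql/hive/")
                   for name in jar.namelist())

print(has_hive_classes(jar_path))  # prints True for this demo jar
```

A build made with -Phive should contain entries under
org/apache/spark/sql/hive/; one built without it should not.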

Can someone please help me understand what is going on here?


