spark-dev mailing list archives

From: "Cheng, Hao" <>
Subject: RE: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore
Date: Mon, 25 May 2015 03:18:17 GMT
Thanks for reporting this.

We intend to support multiple metastore versions in a single build (hive-0.13.1) by introducing
the IsolatedClientLoader, but you're probably hitting a bug. Please file a JIRA issue
for this.

I will keep investigating this as well.


From: Mark Hamstra []
Sent: Sunday, May 24, 2015 9:06 PM
To: Cheolsoo Park
Subject: Re: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore

This discussion belongs on the dev list.  Please post any replies there.

On Sat, May 23, 2015 at 10:19 PM, Cheolsoo Park <> wrote:

I've been testing SparkSQL in the 1.4 RC and found two issues. I wanted to confirm whether these
are bugs before opening a JIRA.

1) I can no longer compile SparkSQL with -Phive-0.12.0. I noticed that in 1.4, IsolatedClientLoader
was introduced so that different versions of the Hive metastore jars can be loaded at runtime,
but as a result SparkSQL no longer compiles with Hive 0.12.0.

My question is: is this intended? If so, shouldn't the hive-0.12.0 profile be removed from the POM?
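For reference, the build invocation that now fails for me is the standard profile-based one, roughly as follows (the Hadoop profile is just an example; substitute whatever profiles you build with):

```shell
# Build the 1.4 RC with the Hive 0.12 profile -- this no longer compiles.
# -Phadoop-2.4 is illustrative; use your own Hadoop profile.
build/mvn -Phive -Phive-thriftserver -Phive-0.12.0 -Phadoop-2.4 \
  -DskipTests clean package
```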

2) After compiling SparkSQL with -Phive-0.13.1, I ran into my second problem. Since I have a Hive
0.12 metastore in production, I have to use it for now. But even when I set "spark.sql.hive.metastore.version"
and "spark.sql.hive.metastore.jars", the SparkSQL CLI throws the following error:

15/05/24 05:03:29 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting
to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'get_functions'
at org.apache.thrift.TServiceClient.receiveBase(
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_functions(
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_functions(
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunctions(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(
at com.sun.proxy.$Proxy12.getFunctions(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getFunctions(
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(
at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:175)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)

What's happening is that when the SparkSQL CLI starts up, it tries to fetch permanent UDFs from
the Hive metastore (due to HIVE-6330, which was introduced in Hive 0.13). But it ends up invoking
a thrift method that doesn't exist in Hive 0.12. To work around this error, I have to comment out
the offending line for now (the getCommandCompletor call at SparkSQLCLIDriver.scala:175 in the
stack trace above).

My question is: is SparkSQL compiled against Hive 0.13 supposed to work with a Hive
0.12 metastore (by setting "spark.sql.hive.metastore.version" and "spark.sql.hive.metastore.jars")?
It only works if I comment out that line of code.
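For completeness, the configuration I'm using looks roughly like this (the jar path is specific to my environment and should point at the Hive 0.12 metastore jars plus their dependencies):

```shell
# Point a SparkSQL built against Hive 0.13.1 at a Hive 0.12 metastore.
# /path/to/hive-0.12/lib is illustrative.
bin/spark-sql \
  --conf spark.sql.hive.metastore.version=0.12.0 \
  --conf spark.sql.hive.metastore.jars=/path/to/hive-0.12/lib/*
```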

