spark-user mailing list archives

From Daniel Zhang <>
Subject java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT on EMR
Date Mon, 18 Mar 2019 15:46:41 GMT

I know the JIRA of this error, and I have read all the comments and even the PR for it.

But I am facing this issue on AWS EMR, and only in the Oozie Spark action. I am looking for someone
who can give me a hint or direction, so I can see whether I can overcome this issue on EMR.

I am testing a simple Spark application on EMR-5.12.2, which comes with Hadoop 2.8.3 + HCatalog
2.3.2 + Spark 2.2.1, and I am using the AWS Glue Data Catalog for both Hive and Spark table metadata.

First of all, both Hive and Spark work fine with AWS Glue as the metadata catalog, and my Spark
application works with spark-submit.

[hadoop@ip-172-31-65-232 oozieJobs]$ spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.1

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("show databases").show
+---------------+
|   databaseName|
+---------------+
|        default|
|       sampledb|
+---------------+

I can access and query the database I created in Glue without any issue in spark-shell or
spark-submit. And, as it relates to the later problem, I can see that when it works in this case,
"spark.sql.hive.metastore.version" is not set explicitly in spark-shell; the default value is shown below:

scala> spark.conf.get("spark.sql.hive.metastore.version")
res2: String = 1.2.1

Even though it shows the version as "1.2.1", I know that by using Glue the Hive metastore
version will be "2.3.2"; I can see "hive-metastore-2.3.2-amzn-1.jar" in the Hive library path.
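
One way to check which jar actually backs a class at runtime is to ask the JVM for its code source. Below is a hedged sketch in plain Java (the helper name `jarOf` is my own, not a Spark or Hive API); the same call can be pasted into spark-shell against "org.apache.hadoop.hive.conf.HiveConf" to see whether the 1.2.1 or the 2.3.2 metastore jar is winning:

```java
import java.security.CodeSource;

public class JarOf {
    // Returns the jar URL a class was loaded from, or null for bootstrap
    // classes and for classes that are not on the classpath at all.
    static String jarOf(String className) {
        try {
            CodeSource src = Class.forName(className)
                    .getProtectionDomain().getCodeSource();
            return (src == null) ? null : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // On the cluster one would run, for example:
        // System.out.println(jarOf("org.apache.hadoop.hive.conf.HiveConf"));
        System.out.println(jarOf("java.lang.String")); // bootstrap class -> null
    }
}
```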

Now here comes the issue: when I test the Spark code in the Oozie Spark action, with "enableHiveSupport"
on the Spark session, it works with spark-submit on the command line, but fails with the
following error in the Oozie runtime:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw
java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
        at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:195)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)

I know this is most likely caused by the Oozie runtime classpath, but I have spent days trying
and still cannot find a solution. We use Spark as the core of our ETL engine, and the ability
to manage and query the Hive catalog is critical for us.
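
As I understand it, a NoSuchFieldError like this usually means the code was compiled against a HiveConf that declares HIVE_STATS_JDBC_TIMEOUT (Hive 1.2.x), but at runtime a HiveConf without that field (Hive 2.x) comes first on the classpath. A reflective field lookup can test which variant a given JVM sees; this is a hedged plain-Java sketch (the helper name `hasField` is my own):

```java
public class FieldCheck {
    // True if the named class is loadable and declares a public field
    // with the given name; false if the class or the field is missing.
    static boolean hasField(String className, String fieldName) {
        try {
            Class.forName(className).getField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Inside the Oozie launcher one would check (class and field names
        // taken from the stack trace above):
        // hasField("org.apache.hadoop.hive.conf.HiveConf$ConfVars",
        //          "HIVE_STATS_JDBC_TIMEOUT");
        System.out.println(hasField("java.lang.Integer", "MAX_VALUE"));
    }
}
```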

Here is what puzzles me:

  *   I know this issue was supposedly fixed in Spark 2.2.0, and on this EMR we are using
Spark 2.2.1.
  *   There is a 1.2.1 version of the Hive metastore jar under the Spark jars on EMR. Does this
mean that in the successful spark-shell runtime, Spark is indeed using the 1.2.1 version of hive-metastore?

[hadoop@ip-172-31-65-232 oozieJobs]$ ls /usr/lib/spark/jars/*hive-meta*

  *   There is a 2.3.2 version of the Hive metastore jar under the Hive component on this EMR, which
I believe points to Glue, right?

[hadoop@ip-172-31-65-232 oozieJobs]$ ls /usr/lib/hive/lib/*hive-meta*
/usr/lib/hive/lib/hive-metastore-2.3.2-amzn-1.jar  /usr/lib/hive/lib/hive-metastore.jar

  *   I specified "oozie.action.sharelib.for.spark=spark,hive" in Oozie, and I can
see the Oozie runtime loads the jars from both the Spark and Hive sharelibs. There is NO hive-metastore-1.2.1-spark2-amzn-0.jar
in the Oozie SPARK sharelib, but there is indeed hive-metastore-2.3.2-amzn-1.jar in the Oozie
HIVE sharelib.
  *   Based on my understanding, here is what I have done so far trying to fix this in the Oozie
runtime, but none of it works:
     *   I added hive-metastore-1.2.1-spark2-amzn-0.jar into the HDFS Oozie Spark sharelib
and ran "oozie admin -sharelibupdate". After that, I confirmed this library was loaded in the Oozie
runtime log of my Spark action, but I got the same error message.
     *   I added "--conf spark.sql.hive.metastore.version=2.3.2" in the <spark-opts>
of my Oozie Spark action, and confirmed this configuration in the Spark session, but I still got
the same error message above.
     *   I added "--conf spark.sql.hive.metastore.version=2.3.2 --conf spark.sql.hive.metastore.jars=maven",
but still got the same error message.
     *   I added "--conf spark.sql.hive.metastore.version=2.3.2 --conf spark.sql.hive.metastore.jars=/etc/spark/conf/hive-site.xml,/usr/lib/spark/jars/*"
in the Oozie Spark action, but got the same error message.
     *   I added "--conf spark.sql.hive.metastore.version=2.3.2 --conf hive.metastore.uris=thrift://ip-172-31-65-232.ec2.internal:9083
--conf spark.sql.hive.metastore.jars=/etc/spark/conf/hive-site.xml,/usr/lib/spark/jars/*"
in the Oozie Spark action, but got the same error.
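
Since the Oozie launcher log prints the effective classpath, one way to narrow this down is to pull the hive-metastore entries out of that classpath string and see whether two versions are mixed. Below is a hedged sketch in plain Java (the helper `metastoreJars` and the sample classpath are illustrative, not actual Oozie output):

```java
import java.util.ArrayList;
import java.util.List;

public class MetastoreConflict {
    // Extracts hive-metastore jar file names from a colon-separated
    // classpath string, as printed in the Oozie launcher logs.
    static List<String> metastoreJars(String classpath) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(":")) {
            String name = entry.substring(entry.lastIndexOf('/') + 1);
            if (name.startsWith("hive-metastore") && name.endsWith(".jar")) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative classpath mixing the two versions seen on this EMR.
        String cp = "/usr/lib/spark/jars/hive-metastore-1.2.1-spark2-amzn-0.jar:"
                  + "/usr/lib/hive/lib/hive-metastore-2.3.2-amzn-1.jar:"
                  + "/usr/lib/spark/jars/spark-core_2.11-2.2.1.jar";
        // More than one distinct version here would suggest a conflict.
        System.out.println(metastoreJars(cp));
    }
}
```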

I have run out of options to try, and I really have no idea what is missing in the Oozie runtime
that causes this error in Spark.

Let me know if you have any idea.


