spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ThuanHV@fsoft.com.vn (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore
Date Thu, 03 May 2018 23:11:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463003#comment-16463003
] 

ThuanHV@fsoft.com.vn edited comment on SPARK-13446 at 5/3/18 11:10 PM:
-----------------------------------------------------------------------

Hi Tavis and Spark Team,

Tavis explanation is very clear. As I read in the other thread, this bug is fixed and fully
tested by QA team. Could someone please help to look into the merging the code.

I am running into the same problem in pyspark.  Bellow my pyspark submit stack trace:

py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState.

: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT

at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)

at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)

at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)

at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)

at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)

at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)

at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)

at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)

at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)

at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)

at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)

at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)

at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)

at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)

at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)

at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)

at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)

at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)

at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)

at scala.Option.getOrElse(Option.scala:121)


was (Author: thuanhv@fsoft.com.vn):
I am running into the same problem in pyspark. Could anyone help resolve this issue, please? Bellow
my pyspark submit stack trace:

py4j.protocol.Py4JJavaError: An error occurred while calling o24.sessionState.

: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT

 at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)

 at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)

 at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)

 at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)

 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)

 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)

 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)

 at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)

 at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)

 at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)

 at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)

 at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)

 at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)

 at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)

 at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)

 at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)

 at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)

 at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)

 at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)

 at scala.Option.getOrElse(Option.scala:121)

> Spark need to support reading data from Hive 2.0.0 metastore
> ------------------------------------------------------------
>
>                 Key: SPARK-13446
>                 URL: https://issues.apache.org/jira/browse/SPARK-13446
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Lifeng Wang
>            Assignee: Xiao Li
>            Priority: Major
>             Fix For: 2.2.0
>
>
> Spark provided HIveContext class to read data from hive metastore directly. While it
only supports hive 1.2.1 version and older. Since hive 2.0.0 has released, it's better to
upgrade to support Hive 2.0.0.
> {noformat}
> 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI thrift://hsw-node13:9083
> 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current connections:
1
> 16/02/23 02:35:02 INFO metastore: Connected to metastore.
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>         at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message