spark-issues mailing list archives

From "JP Bordenave (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore
Date Tue, 01 Oct 2019 20:15:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942275#comment-16942275 ]

JP Bordenave edited comment on SPARK-13446 at 10/1/19 8:14 PM:
---------------------------------------------------------------

1) OK, I am back after some internet issues. I restored all the Hive 1.2.1 jars from Spark 2.4.4.

2) Hive 2.3.3 uses MySQL metastore schema version 2.3.0, which conflicts with Spark 2.4.4 because Spark uses the 1.2.1 schema.

3) I copied hive-site.xml into spark/conf and disabled the schema check; it now works fine under spark-shell.

<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>

But I don't understand why Spark 2.4.4 uses the old Hive 1.2.1 schema (it is not really clear to me), or why I must disable the schema check.

 
{noformat}
scala> spark.sql("show databases").show
+------------+
|databaseName|
+------------+
|     default|
+------------+
scala> spark.sql("show tables").show
19/10/01 21:44:07 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default| employee|      false|
+--------+---------+-----------+
scala> spark.sql("select * from employee").show
+---+-----------+---------+
| id|       name|     dept|
+---+-----------+---------+
|  1|      Allen|       IT|
|  2|        Mag|    Sales|
| 14|     Pierre|      xXx|
|  1|      Allen|       IT|
|  3|        Rob|    Sales|
|  4|       Dana|       IT|
|  7|     Pierre|      xXx|
| 11|     Pierre|      xXx|
| 10|     Pierre|      xXx|
| 12|     Pierre|      xXx|
| 13|     Pierre|      xXx|
+---+-----------+---------+
 {noformat}
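
On the question above of why Spark 2.4.4 expects the 1.2.1 schema, a possible explanation (offered as a sketch, not a confirmed diagnosis of this setup): Spark 2.4.x is built against Hive 1.2.1 by default, so its built-in metastore client assumes that schema version. Spark documents two settings, spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars, for selecting a different metastore client. The fragment below for spark/conf/spark-defaults.conf is untested here. The value 2.3.3 is an assumption chosen to match the Hive version mentioned above, and "maven" assumes the machine can download the Hive jars; the supported version range should be checked in the Spark documentation for the release in use.

{noformat}
# Assumption: the metastore was initialized by Hive 2.3.3
spark.sql.hive.metastore.version   2.3.3
# Assumption: network access is available so Spark can fetch matching Hive client jars
spark.sql.hive.metastore.jars      maven
{noformat}

If this works as intended, the schema check in hive-site.xml might no longer need to be disabled, since the client and the metastore schema would then agree on the version.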

> Spark need to support reading data from Hive 2.0.0 metastore
> ------------------------------------------------------------
>
>                 Key: SPARK-13446
>                 URL: https://issues.apache.org/jira/browse/SPARK-13446
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Lifeng Wang
>            Assignee: Xiao Li
>            Priority: Major
>             Fix For: 2.2.0
>
>
> Spark provides the HiveContext class to read data from the Hive metastore directly. However, it only supports Hive 1.2.1 and older. Since Hive 2.0.0 has been released, it would be better to upgrade to support Hive 2.0.0.
> {noformat}
> 16/02/23 02:35:02 INFO metastore: Trying to connect to metastore with URI thrift://hsw-node13:9083
> 16/02/23 02:35:02 INFO metastore: Opened a connection to metastore, current connections: 1
> 16/02/23 02:35:02 INFO metastore: Connected to metastore.
> Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
>         at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:473)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:192)
>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>         at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:422)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:421)
>         at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:72)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
>         at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

