spark-issues mailing list archives

From "Peter Simon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-4781) Column values become all NULL after doing ALTER TABLE CHANGE for renaming column names (Parquet external table in HiveContext)
Date Thu, 26 Apr 2018 11:14:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453861#comment-16453861 ]

Peter Simon commented on SPARK-4781:
------------------------------------

As commented under SPARK-11748, a possible workaround is:
{code:java}
scala> spark.sql("set parquet.column.index.access=true").show
scala> spark.sql("set spark.sql.hive.convertMetastoreParquet=false").show
scala> spark.sql("select * from test_parq").show
+----+------+
|a_ch|     b|
+----+------+
|   1|     a|
|   2|test 2|
+----+------+{code}
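
A toy model of why this workaround plausibly helps. It assumes the NULLs come from looking up the renamed metastore column by name in a Parquet file that still carries the old column name, and that index-based access matches columns by position instead. This is not Spark or Hive code; `ParquetFile`, `readByName`, and `readByIndex` are hypothetical names used only for illustration.

```scala
// Toy model only: none of these types or functions exist in Spark or Hive.
object ColumnResolutionSketch {
  // A Parquet file keeps the column names it was written with.
  final case class ParquetFile(columns: Vector[String], rows: Vector[Vector[String]])

  // Name-based lookup: a table column that is missing from the file reads as None (NULL).
  def readByName(file: ParquetFile, tableColumns: Vector[String]): Vector[Vector[Option[String]]] =
    file.rows.map { row =>
      tableColumns.map { name =>
        val idx = file.columns.indexOf(name)
        if (idx >= 0) Some(row(idx)) else None
      }
    }

  // Index-based lookup: the i-th table column reads the i-th file column, whatever its name.
  def readByIndex(file: ParquetFile, tableColumns: Vector[String]): Vector[Vector[Option[String]]] =
    file.rows.map { row =>
      tableColumns.indices.toVector.map(i => row.lift(i))
    }

  def main(args: Array[String]): Unit = {
    // The file was written before the rename, so it still holds the old column name.
    val file = ParquetFile(Vector("sorted::cre_ts"), Vector(Vector("12/02/2014 07:38:54")))
    // The metastore schema after ALTER TABLE ... CHANGE.
    val renamed = Vector("cre_ts")

    println(readByName(file, renamed))  // the renamed column is not found by name
    println(readByIndex(file, renamed)) // the value is found by position
  }
}
```

Under this model, index-based access only stays correct while the column order and count in the table definition match the files, so reordering or dropping columns would need separate care.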

> Column values become all NULL after doing ALTER TABLE CHANGE for renaming column names
(Parquet external table in HiveContext)
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4781
>                 URL: https://issues.apache.org/jira/browse/SPARK-4781
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0, 1.2.1, 1.3.0
>            Reporter: Jianshi Huang
>            Priority: Major
>
> I have a table created as follows:
> {code}
> CREATE EXTERNAL TABLE pmt (
>   `sorted::cre_ts` string
> )
> STORED AS PARQUET
> LOCATION '...'
> {code}
> And I renamed the column from sorted::cre_ts to cre_ts by doing:
> {code}
> ALTER TABLE pmt CHANGE `sorted::cre_ts` cre_ts string
> {code}
> After renaming the column, the values in the column become all NULLs.
> {noformat}
> Before renaming:
> scala> sql("select `sorted::cre_ts` from pmt limit 1").collect
> res12: Array[org.apache.spark.sql.Row] = Array([12/02/2014 07:38:54])
> Execute renaming:
> scala> sql("alter table pmt change `sorted::cre_ts` cre_ts string")
> res13: org.apache.spark.sql.SchemaRDD =
> SchemaRDD[972] at RDD at SchemaRDD.scala:108
> == Query Plan ==
> <Native command: executed by Hive>
> After renaming:
> scala> sql("select cre_ts from pmt limit 1").collect
> res16: Array[org.apache.spark.sql.Row] = Array([null])
> {noformat}
> Jianshi



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


