spark-issues mailing list archives

From "Dan Osipov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-5576) saveAsTable into Hive fails due to duplicate columns
Date Tue, 03 Feb 2015 22:52:44 GMT
Dan Osipov created SPARK-5576:
---------------------------------

             Summary: saveAsTable into Hive fails due to duplicate columns
                 Key: SPARK-5576
                 URL: https://issues.apache.org/jira/browse/SPARK-5576
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.2.0
            Reporter: Dan Osipov


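Loading JSON files infers a case-sensitive schema, so two fields whose names differ only by
case become separate columns. Hive treats column names as case-insensitive, so attempting to
save such a schema to Hive results in a duplicate column error.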

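{code}
import org.apache.spark.sql._
import org.apache.spark.sql.hive._
// sc is an existing SparkContext (for example, the one spark-shell provides)
val hive = new HiveContext(sc)
// The schema is inferred from the JSON files, preserving field-name case
val data = hive.jsonFile("/path/")
// Fails when two inferred field names differ only by case
data.saveAsTable("table")
{code}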

Results in an error:
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Duplicate column name data-errorcode in the table definition.

Outputting the schema shows the problem field:
 |-- data-errorCode: string (nullable = true)
 |-- data-errorcode: string (nullable = true)
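
The collision shown above can be checked for before calling saveAsTable. The following is only a
minimal sketch, not part of Spark: DuplicateColumns and caseInsensitiveDuplicates are hypothetical
names introduced for illustration, and the helper works on plain strings so it can be run without
a cluster. It reports groups of field names that differ only by case.

{code}
import java.util.Locale

// Hypothetical helper, not a Spark API: finds field names that collide
// once case is ignored, which is how the Hive error above arises.
object DuplicateColumns {
  // Returns a map from the lower-cased name to the distinct original
  // spellings, keeping only names with more than one spelling.
  def caseInsensitiveDuplicates(names: Seq[String]): Map[String, Seq[String]] =
    names
      .groupBy(_.toLowerCase(Locale.ROOT))
      .map { case (lower, group) => lower -> group.distinct }
      .filter { case (_, spellings) => spellings.size > 1 }
}

// Field names taken from the schema output in this report; "id" is an
// assumed extra field added only to show a non-colliding name.
val collisions = DuplicateColumns.caseInsensitiveDuplicates(
  Seq("data-errorCode", "data-errorcode", "id"))
collisions.foreach { case (lower, spellings) =>
  println(s"$lower <- ${spellings.mkString(", ")}")
}
{code}

Assuming the top-level field names are obtained with something like data.schema.fields.map(_.name),
the same call could be applied to the inferred schema. Nested fields are not inspected by this sketch.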



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
