spark-issues mailing list archives

From "Dan Osipov (JIRA)"
Subject [jira] [Created] (SPARK-5576) saveAsTable into Hive fails due to duplicate columns
Date Tue, 03 Feb 2015 22:52:44 GMT
Dan Osipov created SPARK-5576:

             Summary: saveAsTable into Hive fails due to duplicate columns
                 Key: SPARK-5576
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.2.0
            Reporter: Dan Osipov

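Loading JSON files infers a case-sensitive schema, so two fields whose names differ only by
case become separate columns. Hive treats column names case-insensitively, so attempting to
save such a schema to Hive results in an error.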

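import org.apache.spark.sql._
import org.apache.spark.sql.hive._
// sc is the SparkContext provided by spark-shell
val hive = new HiveContext(sc)
// Schema is inferred from the JSON files, preserving field-name case
val data = hive.jsonFile("/path/")
// Save step named in the summary; "table_name" is a placeholder, not from the report
data.saveAsTable("table_name")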

This results in an error:
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Duplicate column name data-errorcode in the table definition.

Printing the schema shows the two conflicting fields, which differ only by case:
 |-- data-errorCode: string (nullable = true)
 |-- data-errorcode: string (nullable = true)
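
As a rough illustration of the collision (not part of the original report), the check below
groups field names by their lower-cased form and reports any group with more than one distinct
spelling. The helper name caseInsensitiveDuplicates and the "id" field are hypothetical; only
data-errorCode and data-errorcode come from the schema above. It is plain Scala with no Spark
dependency, assuming the field names of the inferred schema have already been collected into a
Seq[String], and can be pasted into spark-shell or a Scala REPL.

```scala
import java.util.Locale

// Hypothetical helper: find field names that collide once case is ignored,
// which is how Hive compares column names.
def caseInsensitiveDuplicates(fieldNames: Seq[String]): Map[String, Seq[String]] =
  fieldNames
    .groupBy(_.toLowerCase(Locale.ROOT))                   // key: lower-cased name
    .filter { case (_, names) => names.distinct.size > 1 } // keep real collisions only

// "id" is an illustrative field; the other two are from the reported schema.
val fields = Seq("id", "data-errorCode", "data-errorcode")
val clashes = caseInsensitiveDuplicates(fields)

clashes.foreach { case (lowered, names) =>
  println(s"$lowered <- ${names.mkString(", ")}")
}
// prints: data-errorcode <- data-errorCode, data-errorcode
```

Running a check like this before saving would surface the problem fields up front, so they
could be renamed or dropped instead of failing inside Hive.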
