spark-issues mailing list archives

From "Rama Mullapudi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-9734) java.lang.IllegalArgumentException: Don't know how to save StructField(sal,DecimalType(7,2),true) to JDBC
Date Fri, 14 Aug 2015 13:36:45 GMT

    [ https://issues.apache.org/jira/browse/SPARK-9734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697003#comment-14697003 ]

Rama Mullapudi commented on SPARK-9734:
---------------------------------------

Current 1.5 still gives an error when creating a table with DataFrame write.jdbc.

The CREATE statement issued by Spark looks like this:
CREATE TABLE foo (TKT_GID DECIMAL(10},0}) NOT NULL)

There are stray closing braces } in the DECIMAL type, which cause the database to throw an error.

I looked at the code on GitHub and found that the schemaString function in JdbcUtils has
extra closing braces } in its DecimalType case, which is causing the issue.


  /**
   * Compute the schema string for this RDD.
   */
  def schemaString(df: DataFrame, url: String): String = {
 .....
            case BooleanType => "BIT(1)"
            case StringType => "TEXT"
            case BinaryType => "BLOB"
            case TimestampType => "TIMESTAMP"
            case DateType => "DATE"
            case t: DecimalType => s"DECIMAL(${t.precision}},${t.scale}})"
            case _ => throw new IllegalArgumentException(s"Don't know how to save $field to JDBC")
          })

  }
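
To make the effect of the extra braces concrete, here is a minimal sketch that runs without Spark. DecimalSpec, buggy, fixed and DecimalDdlSketch are hypothetical names used only for illustration; DecimalSpec stands in for Spark's DecimalType. The buggy interpolation is the one quoted above, and the fixed one is a suggested correction, not necessarily the exact change that will be committed.

```scala
object DecimalDdlSketch {
  // Hypothetical stand-in for Spark's DecimalType, so the sketch needs no Spark dependency.
  final case class DecimalSpec(precision: Int, scale: Int)

  // Interpolation as quoted above: the `}` following each `${...}` is not part of
  // the interpolation, so it is emitted literally into the DDL string.
  def buggy(t: DecimalSpec): String = s"DECIMAL(${t.precision}},${t.scale}})"

  // Suggested correction: exactly one closing brace per interpolated expression.
  def fixed(t: DecimalSpec): String = s"DECIMAL(${t.precision},${t.scale})"

  def main(args: Array[String]): Unit = {
    val t = DecimalSpec(10, 0)
    // Prints: CREATE TABLE foo (TKT_GID DECIMAL(10},0}) NOT NULL)
    println(s"CREATE TABLE foo (TKT_GID ${buggy(t)} NOT NULL)")
    // Prints: CREATE TABLE foo (TKT_GID DECIMAL(10,0) NOT NULL)
    println(s"CREATE TABLE foo (TKT_GID ${fixed(t)} NOT NULL)")
  }
}
```

The first printed line matches the CREATE statement shown above, which suggests the stray braces in the interpolated string are the whole cause of the malformed DDL.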

> java.lang.IllegalArgumentException: Don't know how to save StructField(sal,DecimalType(7,2),true) to JDBC
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-9734
>                 URL: https://issues.apache.org/jira/browse/SPARK-9734
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>            Reporter: Greg Rahn
>            Assignee: Davies Liu
>             Fix For: 1.5.0
>
>
> When using a basic example of reading the EMP table from Redshift via spark-redshift and writing the data back to Redshift, Spark fails with the error below, which is related to Numeric/Decimal data types.
> Redshift table:
> {code}
> testdb=# \d emp
>               Table "public.emp"
>   Column  |         Type          | Modifiers
> ----------+-----------------------+-----------
>  empno    | integer               |
>  ename    | character varying(10) |
>  job      | character varying(9)  |
>  mgr      | integer               |
>  hiredate | date                  |
>  sal      | numeric(7,2)          |
>  comm     | numeric(7,2)          |
>  deptno   | integer               |
> testdb=# select * from emp;
>  empno | ename  |    job    | mgr  |  hiredate  |   sal   |  comm   | deptno
> -------+--------+-----------+------+------------+---------+---------+--------
>   7369 | SMITH  | CLERK     | 7902 | 1980-12-17 |  800.00 |    NULL |     20
>   7521 | WARD   | SALESMAN  | 7698 | 1981-02-22 | 1250.00 |  500.00 |     30
>   7654 | MARTIN | SALESMAN  | 7698 | 1981-09-28 | 1250.00 | 1400.00 |     30
>   7782 | CLARK  | MANAGER   | 7839 | 1981-06-09 | 2450.00 |    NULL |     10
>   7839 | KING   | PRESIDENT | NULL | 1981-11-17 | 5000.00 |    NULL |     10
>   7876 | ADAMS  | CLERK     | 7788 | 1983-01-12 | 1100.00 |    NULL |     20
>   7902 | FORD   | ANALYST   | 7566 | 1981-12-03 | 3000.00 |    NULL |     20
>   7499 | ALLEN  | SALESMAN  | 7698 | 1981-02-20 | 1600.00 |  300.00 |     30
>   7566 | JONES  | MANAGER   | 7839 | 1981-04-02 | 2975.00 |    NULL |     20
>   7698 | BLAKE  | MANAGER   | 7839 | 1981-05-01 | 2850.00 |    NULL |     30
>   7788 | SCOTT  | ANALYST   | 7566 | 1982-12-09 | 3000.00 |    NULL |     20
>   7844 | TURNER | SALESMAN  | 7698 | 1981-09-08 | 1500.00 |    0.00 |     30
>   7900 | JAMES  | CLERK     | 7698 | 1981-12-03 |  950.00 |    NULL |     30
>   7934 | MILLER | CLERK     | 7782 | 1982-01-23 | 1300.00 |    NULL |     10
> (14 rows)
> {code}
> Spark Code:
> {code}
> val url = "jdbc:redshift://rshost:5439/testdb?user=xxx&password=xxx"
> val driver = "com.amazon.redshift.jdbc41.Driver"
> val t = sqlContext.read.format("com.databricks.spark.redshift").option("jdbcdriver", driver).option("url", url).option("dbtable", "emp").option("tempdir", "s3n://spark-temp-dir").load()
> t.registerTempTable("SparkTempTable")
> val t1 = sqlContext.sql("select * from SparkTempTable")
> t1.write.format("com.databricks.spark.redshift").option("driver", driver).option("url", url).option("dbtable", "t1").option("tempdir", "s3n://spark-temp-dir").option("avrocompression", "snappy").mode("error").save()
> {code}
> Error Stack:
> {code}
> java.lang.IllegalArgumentException: Don't know how to save StructField(sal,DecimalType(7,2),true) to JDBC
> 	at org.apache.spark.sql.jdbc.package$JDBCWriteDetails$$anonfun$schemaString$1$$anonfun$2.apply(jdbc.scala:149)
> 	at org.apache.spark.sql.jdbc.package$JDBCWriteDetails$$anonfun$schemaString$1$$anonfun$2.apply(jdbc.scala:136)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.sql.jdbc.package$JDBCWriteDetails$$anonfun$schemaString$1.apply(jdbc.scala:135)
> 	at org.apache.spark.sql.jdbc.package$JDBCWriteDetails$$anonfun$schemaString$1.apply(jdbc.scala:132)
> 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
> 	at org.apache.spark.sql.jdbc.package$JDBCWriteDetails$.schemaString(jdbc.scala:132)
> 	at org.apache.spark.sql.jdbc.JDBCWrapper.schemaString(RedshiftJDBCWrapper.scala:28)
> 	at com.databricks.spark.redshift.RedshiftWriter.createTableSql(RedshiftWriter.scala:39)
> 	at com.databricks.spark.redshift.RedshiftWriter.doRedshiftLoad(RedshiftWriter.scala:105)
> 	at com.databricks.spark.redshift.RedshiftWriter.saveToRedshift(RedshiftWriter.scala:145)
> 	at com.databricks.spark.redshift.DefaultSource.createRelation(DefaultSource.scala:92)
> 	at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:309)
> 	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:144)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

