spark-issues mailing list archives

From "Paul Wu (Jira)" <>
Subject [jira] [Commented] (SPARK-20427) Issue with Spark interpreting Oracle datatype NUMBER
Date Tue, 01 Oct 2019 15:49:00 GMT


Paul Wu commented on SPARK-20427:

Someone asked me about this problem months ago and I found a solution for him, but I had
forgotten it when another teammate asked me again yesterday. I had to spend several hours
on it since her query was quite complex. For the record and for my own reference, I would
like to put the solution here (inspired by [~sobusiak] and [~yumwang]): add the customSchema
option after the read() and declare all potentially troublesome columns as Double types. It
can probably resolve most cases in real applications. Of course, this assumes that one is
not particularly concerned about the exact significant digits in one's application.

// col1, col2, ... are the columns that could cause the trouble
.option("customSchema", "col1 Double, col2 Double")
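
For anyone who wants to see the option in context, here is a minimal PySpark sketch, not
a definitive recipe. The helper names build_custom_schema and read_with_doubles are
hypothetical, as are the url, table and properties arguments; only the "customSchema"
JDBC read option itself comes from the comment above. The spark argument is assumed to be
an existing SparkSession.

```python
def build_custom_schema(columns, sql_type="Double"):
    """Build a customSchema string such as 'col1 Double, col2 Double'.

    `columns` lists the (hypothetical) Oracle NUMBER columns that cause trouble.
    """
    if not columns:
        raise ValueError("at least one column name is required")
    return ", ".join(f"{name} {sql_type}" for name in columns)


def read_with_doubles(spark, url, table, columns, properties=None):
    """Read `table` over JDBC, forcing `columns` to Double via customSchema.

    Assumption: `spark` is an existing SparkSession; `url`, `table` and
    `properties` (e.g. user, password, driver) are placeholders to fill in.
    """
    reader = (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("customSchema", build_custom_schema(columns))
    )
    for key, value in (properties or {}).items():
        reader = reader.option(key, value)
    return reader.load()
```

Columns not named in the customSchema string keep the type Spark infers for them, so only
the troublesome columns need to be listed. As noted above, reading as Double trades exact
significant digits for a schema that loads.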


> Issue with Spark interpreting Oracle datatype NUMBER
> ----------------------------------------------------
>                 Key: SPARK-20427
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Alexander Andrushenko
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 2.3.0
> Oracle has a data type NUMBER. When a field in a table is defined as type NUMBER, the
> field has two components, precision and scale.
> For example, NUMBER(p,s) has precision p and scale s.
> Precision can range from 1 to 38.
> Scale can range from -84 to 127.
> When reading such a field, Spark can create numbers with precision exceeding 38. In our
> case it has created fields with precision 44,
> calculated as the sum of the precision (in our case 34 digits) and the scale (10):
> " requirement failed: Decimal precision 44 exceeds max precision 38...".
> As a result, a data frame was read from a table in one schema but could not be
> inserted into the identical table in another schema.
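
The arithmetic in the quoted report can be checked with a few lines of Python. This is
only a sketch of the calculation as the report describes it (precision plus scale), not
Spark's actual code; the function names are hypothetical, and the limit of 38 is the
maximum precision quoted in the error message.

```python
SPARK_MAX_PRECISION = 38  # limit quoted in the error message


def reported_precision(oracle_precision, oracle_scale):
    """Precision as described in the report: precision plus scale."""
    return oracle_precision + oracle_scale


def fits_spark_decimal(oracle_precision, oracle_scale):
    """True if the summed precision stays within the 38-digit limit."""
    return reported_precision(oracle_precision, oracle_scale) <= SPARK_MAX_PRECISION


# The case from the report: 34 digits of precision and a scale of 10.
print(reported_precision(34, 10))   # 44
print(fits_spark_decimal(34, 10))   # False
```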

This message was sent by Atlassian Jira
