spark-user mailing list archives

From Anthony May <anthony...@gmail.com>
Subject Re: Spark MySQL Invalid DateTime value killing job
Date Thu, 06 Jun 2019 01:10:08 GMT
In classic Murphy's Law fashion, I discovered the solution just after asking
the question:
The JDBC URL should set the zeroDateTimeBehavior connection option.
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
https://stackoverflow.com/questions/11133759/0000-00-00-000000-can-not-be-represented-as-java-sql-timestamp-error
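For reference, a minimal sketch of the fix in the DataFrame API (host, database, table, and credential names are placeholders; only the zeroDateTimeBehavior parameter is the actual fix, and it is documented for Connector/J 5.1 at the link above):

```scala
// zeroDateTimeBehavior controls how Connector/J handles '0000-00-00 00:00:00':
//   exception     - throw SQLException (the default, which was killing the job)
//   convertToNull - return SQL NULL
//   round         - round to '0001-01-01 00:00:00'
val jdbcUrl =
  "jdbc:mysql://db-host:3306/legacy_db?zeroDateTimeBehavior=convertToNull"

// Hypothetical table name and credentials for illustration.
val df = spark.read
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "some_table")
  .option("user", dbUser)
  .option("password", dbPassword)
  .load()

df.write.json("/data/scrape/some_table")
```

Because the option lives in the connection URL rather than in per-table code, the same URL string can be reused across all of the tables being scraped.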

On Wed, Jun 5, 2019 at 6:29 PM Anthony May <anthonymay@gmail.com> wrote:

> Hi,
>
> We have a legacy process of scraping a MySQL Database. The Spark job uses
> the DataFrame API and MySQL JDBC driver to read the tables and save them as
> JSON files. One table has DateTime columns that contain values invalid for
> java.sql.Timestamp so it's throwing the exception:
> java.sql.SQLException: Value '0000-00-00 00:00:00' can not be represented
> as java.sql.Timestamp
>
> Unfortunately, I can't edit the values in the table to make them valid.
> There doesn't seem to be a way to specify row level exception handling in
> the DataFrame API. Is there a way to handle this that would scale for
> hundreds of tables?
>
> Any help is appreciated.
>
> Anthony
>
