spark-user mailing list archives

From Anthony May <>
Subject Spark MySQL Invalid DateTime value killing job
Date Thu, 06 Jun 2019 00:29:19 GMT

We have a legacy process that scrapes a MySQL database. The Spark job uses
the DataFrame API and the MySQL JDBC driver to read the tables and save them
as JSON files. One table has DateTime columns containing values that are
invalid for java.sql.Timestamp, so it throws the exception:
java.sql.SQLException: Value '0000-00-00 00:00:00' can not be represented
as java.sql.Timestamp
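For context, a minimal sketch of how such a read might be configured. The host, database, and table names here are hypothetical; the `zeroDateTimeBehavior` connection property is a real MySQL Connector/J option that, when set to `convertToNull` (5.x drivers; 8.x uses `CONVERT_TO_NULL`), returns NULL for `'0000-00-00 00:00:00'` instead of failing Timestamp conversion:

```python
def mysql_jdbc_url(host, db, convert_zero_dates=True):
    """Build a MySQL JDBC URL, optionally mapping zero dates to NULL.

    zeroDateTimeBehavior=convertToNull is a MySQL Connector/J property
    (spelled CONVERT_TO_NULL in Connector/J 8.x).
    """
    url = f"jdbc:mysql://{host}:3306/{db}"
    if convert_zero_dates:
        url += "?zeroDateTimeBehavior=convertToNull"
    return url

# Hypothetical usage with Spark (requires a SparkSession and the
# Connector/J jar on the classpath):
# df = (spark.read.format("jdbc")
#           .option("url", mysql_jdbc_url("db-host", "legacy_db"))
#           .option("dbtable", "some_table")
#           .load())
# df.write.json("/output/some_table")
```

Because the property lives in the connection URL, the same setting would apply uniformly across many tables read from the same database.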

Unfortunately, I can't edit the values in the table to make them valid, and
there doesn't seem to be a way to specify row-level exception handling in
the DataFrame API. Is there a way to handle this that would scale to
hundreds of tables?

Any help is appreciated.

