spark-dev mailing list archives

From Gengliang <ltn...@gmail.com>
Subject Re: Honor ParseMode in AvroFileFormat
Date Fri, 08 Mar 2019 05:17:24 GMT
Hi Tim,

I think you can try setting the option *spark.sql.files.ignoreCorruptFiles* to
*true*. With the option enabled, Spark jobs continue to run when they
encounter corrupted files, and the contents that have already been read are
still returned.
The CSV/JSON data sources support PERMISSIVE mode when reading files
because users may still want partial row results.
When reading corrupted Avro files, I think skipping the rest of each corrupted
file is enough if users want to ignore them.
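
As a rough illustration of the suggestion above, here is a minimal PySpark sketch. The helper name and the `path` argument are hypothetical, and the `"avro"` format name assumes the spark-avro module is available to the session:

```python
def read_avro_skipping_corrupt_files(spark, path):
    """Read Avro files while letting Spark skip files it cannot decode.

    `spark` is an existing SparkSession; `path` is a hypothetical input
    location supplied by the caller.
    """
    # Session-level SQL conf named in this message.
    spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
    # Assumes the spark-avro module is on the classpath so that the
    # "avro" short name resolves.
    return spark.read.format("avro").load(path)
```

Note that this is a session-wide setting, so it also affects other file-based reads made through the same session.
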
For processing data with the function `from_avro`, I have created a PR to
support *PERMISSIVE*/*FAILFAST* modes:
https://github.com/apache/spark/pull/22814
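
For illustration only, a sketch of how the mode might be passed from PySpark. It assumes the option is exposed as a `mode` entry in the `options` dict of `from_avro`, as in later Spark releases (3.0+); the helper names, the `decoded` alias, and the column and schema arguments are hypothetical:

```python
SUPPORTED_MODES = ("PERMISSIVE", "FAILFAST")


def avro_parse_options(mode="FAILFAST"):
    """Build the options dict for from_avro, validating the parse mode."""
    normalized = mode.upper()
    if normalized not in SUPPORTED_MODES:
        raise ValueError(
            "from_avro supports only %s, got %r" % (SUPPORTED_MODES, mode)
        )
    return {"mode": normalized}


def decode_avro_column(df, column, json_schema, mode="PERMISSIVE"):
    """Decode a binary Avro column of `df` using the given parse mode.

    `json_schema` is the Avro schema as a JSON string.
    """
    # Imported lazily so this module loads even where PySpark is absent.
    from pyspark.sql.avro.functions import from_avro

    decoded = from_avro(df[column], json_schema, avro_parse_options(mode))
    return df.select(decoded.alias("decoded"))
```

Under that assumption, PERMISSIVE would yield null for a record that cannot be decoded, while FAILFAST would raise on the first such record.
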

Gengliang


On Fri, Mar 8, 2019 at 6:25 AM tim <tim@gh.st> wrote:

> /facepalm
>
> Here we go: https://issues.apache.org/jira/browse/SPARK-27093
>
> Tim
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
>
