spark-user mailing list archives

From Wim Van Leuven <wim.vanleu...@highestpoint.biz>
Subject Re:
Date Mon, 02 Mar 2020 06:21:18 GMT
Hey Hamish,

I don't think there is an 'automatic fix' for this problem ...
Are you reading those as partitions of a single dataset? Or are you
processing them individually?

Since your incoming data is apparently not stable, you should implement a
preprocessing step on each file to check it and, if necessary, align the
faulty datasets with what you expect.
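
To illustrate what such a preprocessing step might look like, here is a
minimal sketch in plain Python. It rests on assumptions this thread does not
confirm: that the files are CSV with a header row, and that the faulty ones
have missing, extra, or reordered columns. The names EXPECTED_COLUMNS and
align_csv are hypothetical, as is the schema itself.

```python
import csv
import io

# Hypothetical expected schema; the real one is not stated in this thread.
EXPECTED_COLUMNS = ["id", "name", "amount"]


def align_csv(text, expected=EXPECTED_COLUMNS):
    """Return CSV text whose header and column order match `expected`.

    Columns missing from the input are filled with empty strings,
    unexpected columns are dropped, and the order is normalised.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=expected,
        extrasaction="ignore",  # silently drop columns we do not expect
        restval="",             # fill columns the input does not have
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # A "faulty" file: columns reordered, one extra, one missing.
    faulty = "name,id,extra\nalice,1,x\nbob,2,y\n"
    print(align_csv(faulty), end="")
```

Once every file has been brought to the same layout this way, they could be
read together with a single explicit schema. Depending on how your files
actually deviate, the check may need to do more than this (type validation,
renamed columns, and so on).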

HTH
-wim
