spark-user mailing list archives

From "Shixiong(Ryan) Zhu" <>
Subject Re: Structured Streaming + Kafka - Corrupted Checkpoint Offsets / Commits
Date Thu, 04 Jan 2018 21:53:49 GMT
The root cause is probably that HDFSMetadataLog ignores exceptions thrown
by "output.close". I think this was fixed by this change in Spark 2.2.1
and 3.0.0:

Could you try 2.2.1?
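The failure mode can be sketched roughly as follows. This is not Spark's actual HDFSMetadataLog code, just a minimal, self-contained illustration of the bug pattern: the batch file is created, the write fails only when the stream is closed, and the close() exception is swallowed, so an empty file is left in the checkpoint directory while the caller believes the batch was committed.

```scala
import java.io.{ByteArrayOutputStream, IOException, OutputStream}
import java.nio.file.{Files, Paths}

object CloseSwallowSketch {
  // Simulates an HDFS output stream that buffers writes and fails on close,
  // e.g. because the datanode pipeline breaks while flushing.
  class FailsOnClose extends OutputStream {
    private val buf = new ByteArrayOutputStream()
    override def write(b: Int): Unit = buf.write(b)
    override def close(): Unit =
      throw new IOException("simulated failure flushing on close")
  }

  def writeBatchIgnoringCloseErrors(path: String, data: String): Boolean = {
    Files.createFile(Paths.get(path))     // the batch file appears on disk...
    val out = new FailsOnClose
    out.write(data.getBytes("UTF-8"))     // ...but the bytes stay buffered
    try out.close()
    catch { case _: IOException => () }   // bug: the close failure is ignored
    true                                  // caller treats the batch as committed
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("offsets")
    val batch = dir.resolve("42").toString
    val ok = writeBatchIgnoringCloseErrors(batch, """{"batchWatermarkMs":0}""")
    // reports success even though the batch file on disk is empty
    println(s"success=$ok size=${Files.size(Paths.get(batch))}")
  }
}
```

On restart, the query then finds an empty file as the most recent entry in "offsets" or "commits", which matches the symptom reported below.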

On Thu, Jan 4, 2018 at 9:08 AM, William Briggs <> wrote:

> I am running a Structured Streaming job (Spark 2.2.0) using EMR 5.9. The
> job sources data from a Kafka topic, performs a variety of filters and
> transformations, and sinks data back into a different Kafka topic.
> Once per day, we stop the query in order to merge the NameNode edit logs
> with the fsimage: Structured Streaming creates and deletes a significant
> number of HDFS files, and EMR doesn't support a secondary or HA NameNode
> for fsimage compaction, so the edit logs were filling the disk (AWS support
> directed us to handle it this way).
> Occasionally, the Structured Streaming query will not restart because the
> most recent file in the "commits" or "offsets" checkpoint subdirectory is
> empty. This seems like an undesirable behavior, as it requires manual
> intervention to remove the empty files in order to force the job to fall
> back onto the last good values. Has anyone run into this behavior? The only
> similar issue I can find is SPARK-21760 <>, which appears to have no fix
> or workaround.
> Any assistance would be greatly appreciated!
> Regards,
> Will
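
The manual intervention described above can be sketched as a small helper. This is a hypothetical cleanup routine (the object and method names are made up for illustration): it deletes zero-length batch files from the "offsets" and "commits" checkpoint subdirectories so that, on restart, the query falls back to the last good batch. It is shown against a local filesystem path; for a real HDFS checkpoint the same walk would go through org.apache.hadoop.fs.FileSystem, or `hdfs dfs -ls` / `hdfs dfs -rm` by hand.

```scala
import java.io.File
import java.nio.file.Path

object CheckpointCleanup {
  // Returns the paths of the empty batch files that were removed.
  def removeEmptyBatchFiles(checkpointDir: String): Seq[Path] =
    Seq("offsets", "commits").flatMap { sub =>
      val dir = new File(checkpointDir, sub)
      val files = Option(dir.listFiles()).getOrElse(Array.empty[File])
      files.toSeq
        .filter(f => f.isFile && f.length == 0L) // empty = corrupted batch file
        .map { f => f.delete(); f.toPath }       // remove it and report it
    }
}
```

Note that this only clears the symptom; the empty files will reappear until the underlying close-exception handling is fixed, which is why upgrading past 2.2.0 is the real remedy.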
