spark-issues mailing list archives

From "Shixiong Zhu (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode
Date Fri, 01 Jun 2018 17:49:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shixiong Zhu resolved SPARK-24351.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.4.0

Issue resolved by pull request 21400
[https://github.com/apache/spark/pull/21400]

> offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24351
>                 URL: https://issues.apache.org/jira/browse/SPARK-24351
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.3.0
>            Reporter: huangtengfei
>            Assignee: huangtengfei
>            Priority: Major
>             Fix For: 2.4.0
>
>
> In Structured Streaming, the conf spark.sql.streaming.minBatchesToRetain specifies "the minimum number of batches that must be retained and made recoverable", as described in [SQLConf|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L802].
> In continuous processing (CP) mode, the metadata purge is triggered when an epoch is committed in [ContinuousExecution|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala#L306].
>
> Since currentBatchId increases independently in CP mode, the current committed epoch may be far behind currentBatchId if some task hangs for some time. It is not safe to discard metadata using a thresholdBatchId computed from currentBatchId, because doing so may purge all the metadata in the checkpoint directory.
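
The failure mode described above can be illustrated with a small sketch. This is a toy model, not Spark's code: the dictionary standing in for the offset log, the helper names purge_threshold and purge, the "reference id + 1 - minBatchesToRetain" threshold formula, and the numbers used are all assumptions chosen only to show why the threshold should follow the committed epoch rather than currentBatchId.

```python
def purge_threshold(reference_id, min_batches_to_retain):
    """Hypothetical threshold: entries with id below this value get purged."""
    return reference_id + 1 - min_batches_to_retain


def purge(log, threshold):
    """Return a copy of the toy log keeping only entries with id >= threshold."""
    return {k: v for k, v in log.items() if k >= threshold}


# Assumed scenario: a hung task leaves the committed epoch far behind
# currentBatchId, which keeps advancing on its own in CP mode.
min_batches_to_retain = 3
committed_epoch = 5
current_batch_id = 20

# Toy offset log: one entry per epoch up to the last committed one.
offset_log = {epoch: f"offsets-{epoch}" for epoch in range(committed_epoch + 1)}

# Threshold derived from currentBatchId lies beyond every logged epoch.
unsafe = purge(offset_log, purge_threshold(current_batch_id, min_batches_to_retain))

# Threshold derived from the committed epoch retains the most recent entries.
safe = purge(offset_log, purge_threshold(committed_epoch, min_batches_to_retain))

print(sorted(unsafe))  # no entries survive
print(sorted(safe))    # the last min_batches_to_retain epochs survive
```

In this sketch the currentBatchId-based threshold removes every entry, while the epoch-based threshold keeps the last three committed epochs, which is the behavior minBatchesToRetain is meant to guarantee.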



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org