spark-issues mailing list archives

From "Ankit Raj Boudh (Jira)" <>
Subject [jira] [Commented] (SPARK-30108) Add robust accumulator for observable metrics
Date Sat, 21 Dec 2019 04:07:00 GMT


Ankit Raj Boudh commented on SPARK-30108:

[~hvanhovell], thank you. I will take care of this point during development of this feature.

> Add robust accumulator for observable metrics
> ---------------------------------------------
>                 Key: SPARK-30108
>                 URL:
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Herman van Hövell
>            Priority: Major
> Spark accumulators reflect the work that has been done, and not the data that has been processed. There are situations where one tuple can be processed multiple times, e.g. task/stage retries, speculation, determination of ranges for a global ordering, etc. For observed metrics we need the value of the accumulator to be based on the data and not on the processing.
> The current aggregating accumulator is already robust to some of these issues (like task failure), but we need to add some additional checks to make sure it is foolproof.
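
The difference between counting work and counting data can be illustrated with a small simulation. This is only a sketch in plain Python, not Spark code and not the design proposed in the ticket: `NaiveCounter`, `PerPartitionCounter`, and `run_job` are hypothetical names, and the task list is made-up input in which partition 1 completes twice (as it might under speculation or a retry).

```python
class NaiveCounter:
    """Adds every update, so a re-executed task is counted again."""

    def __init__(self):
        self.value = 0

    def merge(self, partition_id, attempt, rows_seen):
        self.value += rows_seen


class PerPartitionCounter:
    """Keeps one result per partition, so re-execution overwrites instead of adding."""

    def __init__(self):
        self._by_partition = {}

    def merge(self, partition_id, attempt, rows_seen):
        # A retry or speculative copy of the same partition replaces the earlier value.
        self._by_partition[partition_id] = rows_seen

    @property
    def value(self):
        return sum(self._by_partition.values())


def run_job(counter):
    # Hypothetical completed tasks as (partition_id, attempt, rows_seen).
    # Partition 1 appears twice to mimic a speculative or retried task.
    completed_tasks = [(0, 0, 100), (1, 0, 50), (1, 1, 50), (2, 0, 25)]
    for partition_id, attempt, rows_seen in completed_tasks:
        counter.merge(partition_id, attempt, rows_seen)
    return counter.value


naive_total = run_job(NaiveCounter())
robust_total = run_job(PerPartitionCounter())
print(naive_total, robust_total)
```

Keying by partition only addresses repeated execution of the same partition. Cases where the same data is read by a separate job, such as sampling to determine ranges for a global ordering, would presumably need an additional check that this sketch does not attempt.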

This message was sent by Atlassian Jira
