spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Commented] (SPARK-10620) Look into whether accumulator mechanism can replace TaskMetrics
Date Mon, 18 Jan 2016 18:25:39 GMT


Apache Spark commented on SPARK-10620:

User 'andrewor14' has created a pull request for this issue:

> Look into whether accumulator mechanism can replace TaskMetrics
> ---------------------------------------------------------------
>                 Key: SPARK-10620
>                 URL:
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>            Reporter: Patrick Wendell
> This task is simply to explore whether the internal representation used by TaskMetrics
> could be implemented using accumulators rather than maintaining two separate mechanisms.
> Note that we need to continue to preserve the existing "Task Metric" data structures that
> are exposed to users through event logs etc. The question is whether we can use a single
> internal code path and perhaps make this easier to extend in the future.
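The consolidation described above can be sketched outside Spark: a single zero-initialized, mergeable counter abstraction serves both as a user-facing accumulator and as the carrier for an internal task metric. The class and method names below are illustrative only, not Spark's actual internals.

```python
# Hypothetical sketch: one accumulator-style counter as the shared code path
# for user accumulators and internal task metrics alike.

class MetricAccumulator:
    """A zero-initialized, mergeable counter."""

    def __init__(self, name):
        self.name = name
        self.value = 0

    def add(self, v):
        # Called on the executor while the task runs.
        self.value += v

    def merge(self, other):
        # Called on the driver when a completed task's values come back.
        self.value += other.value


# Per-task metrics become a dictionary of accumulators instead of a
# fixed-field TaskMetrics class; adding a metric needs no schema change.
task_metrics = {name: MetricAccumulator(name)
                for name in ("bytesRead", "recordsRead")}
task_metrics["bytesRead"].add(4096)

# Driver-side stage aggregate: merge the per-task values.
stage_total = MetricAccumulator("bytesRead")
stage_total.merge(task_metrics["bytesRead"])
```

Under this sketch, the existing TaskMetrics structures exposed in event logs could be rebuilt as a read-only view over the accumulator values at task completion.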
> I think a full exploration would answer the following questions:
> - How do the semantics of accumulators on stage retries differ from aggregate TaskMetrics
> for a stage? Could we implement clearer retry semantics for internal accumulators to make
> them the same, for instance by zeroing accumulator values if a stage is retried (see the
> discussion in SPARK-10042)?
> - Are there metrics that do not fit well into the accumulator model, or that would be
> difficult to update as accumulators?
> - If we expose metrics through accumulators in the future rather than continuing to add
> fields to TaskMetrics, what is the best way to ensure compatibility?
> - Are there any other considerations?
> - Is it worth it to do this, or is the consolidation too complicated to justify?
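The first question above (retry semantics) can be sketched as giving each stage attempt a fresh set of internal accumulators, so a retried stage starts from zero and values from the failed attempt are discarded rather than double-counted. This mirrors the zeroing idea from the SPARK-10042 discussion; the names here are hypothetical, not Spark's implementation.

```python
# Hypothetical sketch of zero-on-retry semantics for internal accumulators.

class StageAttemptMetrics:
    """Accumulated metric totals for a single stage attempt."""

    def __init__(self):
        self.totals = {}  # metric name -> aggregated value

    def merge_task(self, name, value):
        self.totals[name] = self.totals.get(name, 0) + value


class StageMetrics:
    """Keeps only the metrics of the latest stage attempt."""

    def __init__(self):
        self.attempt = StageAttemptMetrics()

    def retry(self):
        # Zero the internal accumulators: drop the failed attempt's values
        # so the re-run does not double-count.
        self.attempt = StageAttemptMetrics()


stage = StageMetrics()
stage.attempt.merge_task("bytesRead", 1000)  # tasks from attempt 0
stage.retry()                                # stage failed and is re-run
stage.attempt.merge_task("bytesRead", 700)   # tasks from attempt 1
# Only attempt 1's values remain: stage.attempt.totals == {"bytesRead": 700}
```

User-visible accumulators historically kept values across retries, which is one way their semantics diverge from aggregate TaskMetrics; a design like the above would have to decide per accumulator whether zeroing applies.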

This message was sent by Atlassian JIRA
