spark-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Improving system design logging in spark
Date Wed, 20 Apr 2016 17:47:53 GMT
Interesting.

For #3:

bq. reading data from,

I guess you meant reading from disk.

On Wed, Apr 20, 2016 at 10:45 AM, atootoonchian <ali@levyx.com> wrote:

> The current Spark logging mechanism can be improved by adding the following
> metrics. They would help in understanding system bottlenecks and give Spark
> application developers useful guidance for designing an optimized
> application.
>
> 1. Shuffle Read Local Time: Time for a task to read shuffle data from local
> storage.
> 2. Shuffle Read Remote Time: Time for a task to read shuffle data from a
> remote node.
> 3. Distribution of processing time between computation, I/O, and network: Show
> how each task's processing time is split between computation, reading
> data from, and reading data from the network.
> 4. Average I/O bandwidth: Average I/O throughput for each task when
> it fetches data from disk.
> 5. Average Network bandwidth: Average network throughput for each task when
> it fetches data from remote nodes.
>
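To make the five proposed metrics concrete, here is a minimal sketch of how they could be derived from per-task measurements. Everything in it is an assumption for illustration: the `TaskTimings` record, its field names, and both helper functions are hypothetical and are not existing Spark APIs. It also assumes that compute, local read, and remote read time do not overlap, which may not hold when fetches are pipelined with computation.

```python
from dataclasses import dataclass


@dataclass
class TaskTimings:
    """Hypothetical per-task measurements; not existing Spark fields."""
    compute_ms: float              # time spent in computation
    shuffle_read_local_ms: float   # proposed metric 1
    shuffle_read_remote_ms: float  # proposed metric 2
    local_bytes_read: int          # bytes fetched from local storage
    remote_bytes_read: int         # bytes fetched from remote nodes


def time_distribution(t: TaskTimings) -> dict:
    """Proposed metric 3: fraction of task time per category.

    Assumes the three time components are disjoint.
    """
    total = t.compute_ms + t.shuffle_read_local_ms + t.shuffle_read_remote_ms
    if total == 0:
        return {"compute": 0.0, "io": 0.0, "network": 0.0}
    return {
        "compute": t.compute_ms / total,
        "io": t.shuffle_read_local_ms / total,
        "network": t.shuffle_read_remote_ms / total,
    }


def throughput_mb_per_s(num_bytes: int, elapsed_ms: float) -> float:
    """Proposed metrics 4 and 5: average throughput in MB/s (1 MB = 10**6 bytes)."""
    if elapsed_ms <= 0:
        return 0.0
    return (num_bytes / 1e6) / (elapsed_ms / 1000.0)
```

For a task with 600 ms of compute, 300 ms of local reads (150 MB), and 100 ms of remote reads (20 MB), `time_distribution` gives a 60/30/10 split, and `throughput_mb_per_s` gives roughly 500 MB/s for disk and 200 MB/s for network.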
