spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tathagata Das <t...@databricks.com>
Subject Re: spark streaming performance
Date Thu, 09 Jul 2015 10:46:11 GMT
I think I know what is going on. What is the storage level that you are
using for your receivers? Are you using replication in the storage level?

Whats going on here is the following. There is 1 task with process_level
locality that is immediately launched as soon as the stage is submitted.
After that the somehow the system is not able to launch any process local
tasks, so has to wait for the spark.locality.wait = 3 seconds, to allow the
lower levels of locality, which is rack local. Then it launches rest of 49
tasks, which finish within 1 second in total. Overall, its taking 3 second
wait + 1 second + something = approx 5 seconds.

Increasing the number of cores per container will definitely increase
performance.

TD

On Thu, Jul 9, 2015 at 2:09 AM, Michel Hubert <michelh@vsnsystemen.nl>
wrote:

>  What do you mean with “full task list of all the 50 tasks”?
>
>
>
> This?
>  Details for Stage 271
>
> ·         *Total task time across all tasks:* 0.7 s
>
> ·         *Shuffle write:* 1377.0 B / 51
>
>  *Show additional metrics*
>  Summary Metrics for 51 Completed Tasks
>
> *Metric*
>
> *Min*
>
> *25th percentile*
>
> *Median*
>
> *75th percentile*
>
> *Max*
>
> Duration
>
> 13 ms
>
> 13 ms
>
> 14 ms
>
> 14 ms
>
> 15 ms
>
> GC Time
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> Shuffle Write Size / Records
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>    Aggregated Metrics by Executor
>
> *Executor ID*
>
> *Address*
>
> *Task Time*
>
> *Total Tasks*
>
> *Failed Tasks*
>
> *Succeeded Tasks*
>
> *Shuffle Write Size / Records*
>
> 2
>
> bfravicsvr81441-cld.opentsp.com:47790
>
> 0.9 s
>
> 51
>
> 0
>
> 51
>
> 1377.0 B / 51
>    Tasks
>
> *Index*
>
> *ID*
>
> *Attempt*
>
> *Status*
>
> *Locality Level*
>
> *Executor ID / Host*
>
> *Launch Time*
>
> *Duration*
>
> *GC Time*
>
> *Write Time*
>
> *Shuffle Write Size / Records*
>
> *Errors*
>
> 50
>
> 3549
>
> 0
>
> SUCCESS
>
> PROCESS_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:10
>
> 13 ms
>
> 27.0 B / 1
>
> 0
>
> 3550
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 15 ms
>
> 27.0 B / 1
>
> 1
>
> 3551
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 14 ms
>
> 27.0 B / 1
>
> 2
>
> 3552
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 14 ms
>
> 27.0 B / 1
>
> 3
>
> 3553
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 14 ms
>
> 27.0 B / 1
>
> 4
>
> 3554
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 13 ms
>
> 27.0 B / 1
>
> 5
>
> 3555
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 13 ms
>
> 27.0 B / 1
>
> 6
>
> 3556
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 14 ms
>
> 27.0 B / 1
>
> 7
>
> 3557
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:13
>
> 15 ms
>
> 27.0 B / 1
>
> 8
>
> 3558
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 9
>
> 3559
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 10
>
> 3560
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 11
>
> 3561
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 12
>
> 3562
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 13
>
> 3563
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 15 ms
>
> 27.0 B / 1
>
> 14
>
> 3564
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 15
>
> 3565
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 16
>
> 3566
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 17
>
> 3567
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 18
>
> 3568
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 19
>
> 3569
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 20
>
> 3570
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 21
>
> 3571
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 22
>
> 3572
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 23
>
> 3573
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 24
>
> 3574
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 25
>
> 3575
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 26
>
> 3576
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 27
>
> 3577
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 28
>
> 3578
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 15 ms
>
> 27.0 B / 1
>
> 29
>
> 3579
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 30
>
> 3580
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 31
>
> 3581
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 32
>
> 3582
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 33
>
> 3583
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 15 ms
>
> 27.0 B / 1
>
> 34
>
> 3584
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 35
>
> 3585
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 36
>
> 3586
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 37
>
> 3587
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 38
>
> 3588
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 39
>
> 3589
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 40
>
> 3590
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 41
>
> 3591
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 42
>
> 3592
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 13 ms
>
> 27.0 B / 1
>
> 43
>
> 3593
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 15 ms
>
> 27.0 B / 1
>
> 44
>
> 3594
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 45
>
> 3595
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 46
>
> 3596
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 47
>
> 3597
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 48
>
> 3598
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 14 ms
>
> 27.0 B / 1
>
> 49
>
> 3599
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 2 / bfravicsvr81441-cld.opentsp.com
>
> 2015/07/09 11:09:14
>
> 15 ms
>
> 27.0 B / 1
>
>
>
>
>
> *From:* Tathagata Das [mailto:tdas@databricks.com]
> *Sent:* donderdag 9 juli 2015 10:28
> *To:* Michel Hubert
> *Cc:* user@spark.apache.org
> *Subject:* Re: spark streaming performance
>
>
>
> Can you provide the full task list of all the 50 tasks? I am curious to
> find out whats going on.
>
>
>
> And spark.executor.cores needs to be set. Or --executor-cores in
> spark-submit
>
>
>
>
>
> On Thu, Jul 9, 2015 at 1:05 AM, Michel Hubert <michelh@vsnsystemen.nl>
> wrote:
>
>
>
>
>
> Hi,
>
>
>
> Thanks for you reply!
>
>
>
>
>
> *@What were the number of cores in the executor? It could be that you had
> only one core in the executor which did all the 50 tasks serially so 50
> task X 15 ms = ~ 1 second. *
>
>
>
> What metric/page shows the number of cores in the executor? Should I
> specify “spark.executor.cores” when submitting the application?
>
>
>
>
>
>
>
> @*Could you take a look at the task details in the stage page to see when
> the tasks were added to see whether it explains the 5 second?*
>
>
>
> Stage 704 seems to be submitted 4 sec later. What could be the reason
> other than stage 703 finished in 4 to 5 sec?
>
>
>
>
>
> I’ve copy pasted some more detailed info (masked out some servernames
> xxxNUMBERxxx).
> Completed Stages (530)
>
> *Stage Id*
>
> *Description*
>
> *Submitted*
>
> *Duration*
>
> *Tasks: Succeeded/Total*
>
> *Input*
>
> *Output*
>
> *Shuffle Read*
>
> *Shuffle Write*
>
> 706
>
> print at LogonAnalysis.scala:217
> <http://bfravicsvr81434.opentsp.com:8088/proxy/application_1432720182377_0443/stages/stage?id=706&attempt=0>
> +details
>
> 2015/07/09 09:56:24
>
> 14 ms
>
> 1/1
>
> 704
>
> print at LogonAnalysis.scala:217
> <http://bfravicsvr81434.opentsp.com:8088/proxy/application_1432720182377_0443/stages/stage?id=704&attempt=0>
> +details
>
> 2015/07/09 09:56:24
>
> 74 ms
>
> 1/1
>
> 1377.0 B
>
> 703
>
> count at LogonAnalysis.scala:217
> <http://bfravicsvr81434.opentsp.com:8088/proxy/application_1432720182377_0443/stages/stage?id=703&attempt=0>
> +details
>
> 2015/07/09 09:56:20
>
> 5 s
>
> 51/51
>
> 1377.0 B
>
>
>
>
>
>
>
> *Details for Stage 703*
>
> ·         *Total task time across all tasks: *0.6 s
>
> ·         *Shuffle write: *1377.0 B / 51
>
> *Show additional metrics*
>
> *Summary Metrics for 51 Completed Tasks*
>
> *Metric*
>
> *Min*
>
> *25th percentile*
>
> *Median*
>
> *75th percentile*
>
> *Max*
>
> Duration
>
> 11 ms
>
> 12 ms
>
> 12 ms
>
> 13 ms
>
> 23 ms
>
> GC Time
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> Shuffle Write Size / Records
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> 27.0 B / 1
>
> *Aggregated Metrics by Executor*
>
> *Executor ID*
>
> *Address*
>
> *Task Time*
>
> *Total Tasks*
>
> *Failed Tasks*
>
> *Succeeded Tasks*
>
> *Shuffle Write Size / Records*
>
> 1
>
> xxx81440xxx:51081
>
> 0.8 s
>
> 51
>
> 0
>
> 51
>
> 1377.0 B / 51
>
> *Tasks*
>
> *Index*
>
> *ID*
>
> *Attempt*
>
> *Status*
>
> *Locality Level*
>
> *Executor ID / Host*
>
> *Launch Time*
>
> *Duration*
>
> *GC Time*
>
> *Write Time*
>
> *Shuffle Write Size / Records*
>
> *Errors*
>
> 50
>
> 9308
>
> 0
>
> SUCCESS
>
> PROCESS_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:20
>
> 11 ms
>
> 27.0 B / 1
>
> 0
>
> 9309
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 14 ms
>
> 27.0 B / 1
>
> 1
>
> 9310
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 2
>
> 9311
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 3
>
> 9312
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 4
>
> 9313
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 13 ms
>
> 27.0 B / 1
>
> 5
>
> 9314
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 11 ms
>
> 27.0 B / 1
>
> 6
>
> 9315
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 13 ms
>
> 27.0 B / 1
>
> 7
>
> 9316
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 14 ms
>
> 27.0 B / 1
>
> 8
>
> 9317
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 13 ms
>
> 27.0 B / 1
>
> 9
>
> 9318
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 10
>
> 9319
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 11
>
> 9320
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 14 ms
>
> 27.0 B / 1
>
> 12
>
> 9321
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:23
>
> 12 ms
>
> 27.0 B / 1
>
> 13
>
> 9322
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 14
>
> 9323
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 15
>
> 9324
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 16
>
> 9325
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 17
>
> 9326
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 18
>
> 9327
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 19
>
> 9328
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 20
>
> 9329
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 21
>
> 9330
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 22
>
> 9331
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 23
>
> 9332
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 24
>
> 9333
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 25
>
> 9334
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 26
>
> 9335
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 27
>
> 9336
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 28
>
> 9337
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 29
>
> 9338
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 30
>
> 9339
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 31
>
> 9340
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 32
>
> 9341
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 33
>
> 9342
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 34
>
> 9343
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 35
>
> 9344
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 36
>
> 9345
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 37
>
> 9346
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 38
>
> 9347
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 39
>
> 9348
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 23 ms
>
> 27.0 B / 1
>
> 40
>
> 9349
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 41
>
> 9350
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 42
>
> 9351
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 43
>
> 9352
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 11 ms
>
> 27.0 B / 1
>
> 44
>
> 9353
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
> 45
>
> 9354
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 46
>
> 9355
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 47
>
> 9356
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 48
>
> 9357
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 12 ms
>
> 27.0 B / 1
>
> 49
>
> 9358
>
> 0
>
> SUCCESS
>
> RACK_LOCAL
>
> 1 / xxx81440xxx
>
> 2015/07/09 09:56:24
>
> 13 ms
>
> 27.0 B / 1
>
>
>
>
>
> *Streaming*
>
> ·         *Started at: *Thu Jul 09 09:27:01 CEST 2015
>
> ·         *Time since start: *24 minutes
>
> ·         *Network receivers: *1
>
> ·         *Batch interval: *10 seconds
>
> ·         *Processed batches: *143
>
> ·         *Waiting batches: *1
>
> ·         *Received records: *295016
>
> ·         *Processed records: *292937
>
>
>
>  *Statistics over last 100 processed batches*
>
> *Receiver Statistics*
>
> ·         *Receiver*
>
> ·         *Status*
>
> ·         *Location*
>
> ·         *Records in last batch*
>
> ·         *[2015/07/09 09:51:01]*
>
> ·         *Minimum rate*
>
> ·         *[records/sec]*
>
> ·         *Median rate*
>
> ·         *[records/sec]*
>
> ·         *Maximum rate*
>
> ·         *[records/sec]*
>
> ·         *Last Error*
>
> SocketReceiver-0
>
> ACTIVE
>
> xxx81441xx
>
> 2.1 K
>
> 196
>
> 205
>
> 218
>
> -
>
> *Batch Processing Statistics*
>
> *Metric*
>
> *Last batch*
>
> *Minimum*
>
> *25th percentile*
>
> *Median*
>
> *75th percentile*
>
> *Maximum*
>
> Processing Time
>
> 4 seconds 679 ms
>
> 4 seconds 653 ms
>
> 4 seconds 702 ms
>
> 4 seconds 736 ms
>
> 4 seconds 775 ms
>
> 4 seconds 926 ms
>
> Scheduling Delay
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 1 ms
>
> Total Delay
>
> 4 seconds 679 ms
>
> 4 seconds 653 ms
>
> 4 seconds 702 ms
>
> 4 seconds 737 ms
>
> 4 seconds 775 ms
>
> 4 seconds 926 ms
>
>
>
>
>
>
>
> *Executors (3)*
>
> ·         *Memory:* 0.0 B Used (1326.0 MB Total)
>
> ·         *Disk:* 0.0 B Used
>
> *Executor ID*
>
> *Address*
>
> *RDD Blocks*
>
> *Memory Used*
>
> *Disk Used*
>
> *Active Tasks*
>
> *Failed Tasks*
>
> *Complete Tasks*
>
> *Total Tasks*
>
> *Task Time*
>
> *Input*
>
> *Shuffle Read*
>
> *Shuffle Write*
>
> *Logs*
>
> *Thread Dump*
>
> 1
>
> xxx81440xxx:51081
>
> 0
>
> 0.0 B / 530.3 MB
>
> 0.0 B
>
> 0
>
> 0
>
> 8159
>
> 8159
>
> 3.0 m
>
> 0.0 B
>
> 392.0 B
>
> 206.8 KB
>
> stdout
>
> stderr
>
> Thread Dump
>
> 2
>
> xxx81441xxx:34114
>
> 0
>
> 0.0 B / 530.3 MB
>
> 0.0 B
>
> 1
>
> 0
>
> 35
>
> 36
>
> 5.1 s
>
> 0.0 B
>
> 364.0 B
>
> 700.0 B
>
> stdout
>
> stderr
>
> Thread Dump
>
> <driver>
>
> xxx81436xxx:38325
>
> 0
>
> 0.0 B / 265.4 MB
>
> 0.0 B
>
> 0
>
> 0
>
> 0
>
> 0
>
> 0 ms
>
> 0.0 B
>
> 0.0 B
>
> 0.0 B
>
> Thread Dump
>
>
>
>
>
>
>
>
>
>
>
>
>
> *From:* Tathagata Das [mailto:tdas@databricks.com]
> *Sent:* donderdag 9 juli 2015 09:38
> *To:* Michel Hubert
> *Cc:* user@spark.apache.org
> *Subject:* Re: spark streaming performance
>
>
>
> What were the number of cores in the executor? It could be that you had
> only one core in the executor which did all the 50 tasks serially so 50
> task X 15 ms = ~ 1 second.
>
> Could you take a look at the task details in the stage page to see when
> the tasks were added to see whether it explains the 5 second?
>
>
>
> On Thu, Jul 9, 2015 at 12:21 AM, Michel Hubert <michelh@vsnsystemen.nl>
> wrote:
>
>
>
> Hi,
>
>
>
> I’ve developed a POC Spark Streaming application.
>
> But it seems to perform better on my development machine  than on our
> cluster.
>
> I submit it to yarn on our cloudera cluster.
>
>
>
> But my first question is more detailed:
>
>
>
> In de application UI (:4040) I see in the streaming section that the batch
> processing took 6 sec.
>
> Then when I look at the stages I indeed see a stage with duration 5s.
>
>
>
> For example:
>
> 1678
>
> map at LogonAnalysis.scala:215+details
>
> 2015/07/09 09:17:00
>
> 5 s
>
> 50/50
>
> 173.5 KB
>
>
>
> But when I look into the details of state 1678 it tells me the duration
> was 14 ms and the aggregated metrics by executor has 1.0s as Task Time.
>
> What is responsible for the gap between 14 ms, 1s and 5 sec?
>
>
>
>
>
> *Details for Stage 1678*
>
> ·         *Total task time across all tasks: *0.8 s
>
> ·         *Shuffle write: *173.5 KB / 2031
>
>  *Show additional metrics*
>
> *Summary Metrics for 50 Completed Tasks*
>
> *Metric*
>
> *Min*
>
> *25th percentile*
>
> *Median*
>
> *75th percentile*
>
> *Max*
>
> Duration
>
> 14 ms
>
> 14 ms
>
> 15 ms
>
> 15 ms
>
> 24 ms
>
> GC Time
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> 0 ms
>
> Shuffle Write Size / Records
>
> 2.6 KB / 28
>
> 3.1 KB / 35
>
> 3.5 KB / 42
>
> 3.9 KB / 46
>
> 4.4 KB / 53
>
> *Aggregated Metrics by Executor*
>
> *Executor ID*
>
> *Address*
>
> *Task Time*
>
> *Total Tasks*
>
> *Failed Tasks*
>
> *Succeeded Tasks*
>
> *Shuffle Write Size / Records*
>
> 2
>
> xxxx:44231
>
> 1.0 s
>
> 50
>
> 0
>
> 50
>
> 173.5 KB / 2031
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

Mime
View raw message