spark-issues mailing list archives

From "Carlos Fuertes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large
Date Fri, 01 Aug 2014 20:31:39 GMT

    [ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082909#comment-14082909 ]

Carlos Fuertes commented on SPARK-2017:
---------------------------------------

I have been digging into why the tables render so slowly. As it happens, the
bottleneck is in the CSS that is currently used for rendering the tables, in particular
bootstrap.css and this type of definition:

.table-striped tbody>tr:nth-child(odd)>td,.table-striped tbody>tr:nth-child(odd)>th{background-color:#f9f9f9;}

The nth-child(odd) selector slows rendering so much on large tables that, for big tables,
the whole page stalls.

I have made a change in pull request [1682] where I use a very simple custom CSS table
styling (keeping the same overall look and feel but with no nth-child call). I have
not changed the sortable option of the tables.
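
For illustration only, here is a minimal sketch of what table styling without nth-child
could look like. The class names `table-plain` and `odd` are hypothetical and are not taken
from the pull request; the sketch assumes the code that generates the HTML adds the `odd`
class to alternate rows itself. The actual stylesheet in the pull request may differ.

```css
/* Hypothetical sketch: class names are assumptions, not the pull request's. */
.table-plain {
  border-collapse: collapse;
  width: 100%;
}

.table-plain th,
.table-plain td {
  padding: 4px 8px;
  border-top: 1px solid #dddddd;
}

/* Striping through an explicit class on the row instead of the
   structural pseudo-class tr:nth-child(odd). This assumes the page
   generator emits class="odd" on every other <tr>. */
.table-plain tr.odd > td,
.table-plain tr.odd > th {
  background-color: #f9f9f9;
}
```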

Now if you run for example 

sc.parallelize(1 to 1000000, 50000).count()

loading the whole page /stages/stage/?id=0 takes ~11 s. Of that, 2.10 s is spent
loading the JSON from the driver (16.7 MB in total) and the rest is spent rendering the
table. Since the JSON request is async, the rest of the page is visible immediately anyway.

I think this would solve the responsiveness problem for a reasonably large number of tasks as
a first pass. I have also applied the same solution to all tables under Storage, where the same
thing was happening [SPARK-2016].

> web ui stage page becomes unresponsive when the number of tasks is large
> ------------------------------------------------------------------------
>
>                 Key: SPARK-2017
>                 URL: https://issues.apache.org/jira/browse/SPARK-2017
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>            Reporter: Reynold Xin
>              Labels: starter
>
> {code}
> sc.parallelize(1 to 1000000, 1000000).count()
> {code}
> The above code creates one million tasks to be executed. The stage detail web ui page
> takes forever to load (if it ever completes).
> There are again a few different alternatives:
> 0. Limit the number of tasks we show.
> 1. Pagination
> 2. By default only show the aggregate metrics and failed tasks, and hide the successful
> ones.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
