spark-issues mailing list archives

From "Carlos Fuertes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-2017) web ui stage page becomes unresponsive when the number of tasks is large
Date Tue, 19 Aug 2014 03:47:19 GMT

    [ https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101810#comment-14101810 ]

Carlos Fuertes commented on SPARK-2017:
---------------------------------------

Hi,

I addressed the slow css rendering of the tables by creating a separate css class, "spark-simple-table",
under core/main/resources/ui.static/spark.css, and using it when rendering those tables: you call
the "listingTable" method with the optional param "simpleTable" set to true. Otherwise, by default
you get the bootstrap css table class that was used before.

But according to the simple tests that I have done, what you gain from that is marginal
(see also what I posted at SPARK-2016). The real issue is the responsiveness of the page after
it has loaded. To really improve that, the best solution was to use ajax, js and JSON to load
the data asynchronously. That way the base html page is much smaller, loads instantly, and the
web browser remains responsive all the time. As I also described in SPARK-2016, no matter what
css table class you use, for big table sizes (I have tested with data sizes up to 15MB, which
is roughly the table size you generate with 50000 in the examples above) the browser becomes
completely unresponsive once the page has loaded. If you load the data using an ajax call
instead, the page remains perfectly browsable.
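
To illustrate the idea of serving the task data as JSON instead of embedding it in the html,
here is a minimal sketch. It is not the code from the pull request: TaskRow, TaskJson,
rowToJson, rowsToJson and basePage are hypothetical names, and the field set is an assumption.

```scala
// Hypothetical sketch: none of these names come from Spark or from the pull request.
final case class TaskRow(index: Int, status: String, durationMs: Long)

object TaskJson {
  // Escape the characters that would break a JSON string literal.
  private def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case '\n' => sb.append("\\n")
      case c    => sb.append(c)
    }
    sb.toString
  }

  // One task becomes one JSON object.
  def rowToJson(row: TaskRow): String =
    s"""{"index":${row.index},"status":"${escape(row.status)}","durationMs":${row.durationMs}}"""

  // The whole table becomes a JSON array that the browser fetches separately.
  def rowsToJson(rows: Seq[TaskRow]): String =
    rows.map(rowToJson).mkString("[", ",", "]")

  // The base page carries only an empty table and the URL of the data,
  // so its size does not grow with the number of tasks.
  def basePage(dataUrl: String): String =
    s"""<table id="task-table" data-source="$dataUrl"></table>"""
}
```

The base page stays the same size regardless of how many rows rowsToJson produces; the
browser-side js that fetches the data and fills the table is not shown here.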

In pull request #1682, ajax and js are used by default to render those tables. I created a config
variable "spark.ui.jsRenderingEnabled", which is true by default. If you set it to false in
your properties, you go back to the original way of creating one big html page with all the data
embedded in it.
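
As a rough sketch of how such a flag with a default of true could be read (the UiSettings
object and its plain Map input are assumptions for illustration, not the code in the pull
request):

```scala
// Hypothetical helper: reads the flag from a plain key/value map rather than
// from Spark's own configuration classes, to keep the example self-contained.
object UiSettings {
  val JsRenderingKey = "spark.ui.jsRenderingEnabled"

  // A missing key means "enabled", matching the default described above.
  def jsRenderingEnabled(props: Map[String, String]): Boolean =
    props.get(JsRenderingKey).map(_.trim.toBoolean).getOrElse(true)
}
```

A caller would branch on jsRenderingEnabled to choose between the JSON-backed table and the
original page with the data embedded in the html.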

Of course, all of this is without using pagination to show the data; that could also be done.
But from what I am seeing, using JSON to serve the data gives you much more flexibility going
forward, for other uses and extensions, and increases the overall responsiveness of the pages
no matter how you finally render them.
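
For reference, pagination over the same data could be as small as the sketch below. Paging and
page are hypothetical names, and this is not part of the pull request.

```scala
// Hypothetical sketch of server-side pagination over an in-memory sequence of rows.
object Paging {
  // Returns the rows of a 1-based page; a page past the end yields an empty Seq.
  def page[A](rows: Seq[A], pageNumber: Int, pageSize: Int): Seq[A] = {
    require(pageNumber >= 1 && pageSize >= 1, "pageNumber and pageSize must be positive")
    rows.slice((pageNumber - 1) * pageSize, pageNumber * pageSize)
  }
}
```

Combined with a JSON endpoint, each ajax call would then only need to transfer one page of
rows instead of the whole table.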




> web ui stage page becomes unresponsive when the number of tasks is large
> ------------------------------------------------------------------------
>
>                 Key: SPARK-2017
>                 URL: https://issues.apache.org/jira/browse/SPARK-2017
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>            Reporter: Reynold Xin
>              Labels: starter
>
> {code}
> sc.parallelize(1 to 1000000, 1000000).count()
> {code}
> The above code creates one million tasks to be executed. The stage detail web ui page takes forever to load (if it ever completes).
> There are again a few different alternatives:
> 0. Limit the number of tasks we show.
> 1. Pagination
> 2. By default only show the aggregate metrics and failed tasks, and hide the successful ones.




