spark-user mailing list archives

From Guillermo Ortiz Fernández <guillermo.ortiz.f...@gmail.com>
Subject Trying to improve performance of the driver.
Date Thu, 13 Sep 2018 15:48:05 GMT
I have a Spark Streaming process that takes 2 seconds. When I check where
the time is spent, I see about 0.8s-1s of processing time, although the
total time is 2s. The remaining second is spent in the driver.
I reviewed the code that the driver executes and commented out some of it,
but the result was the same, so I have no idea where the time is being
spent.

Right now I'm running in client mode from one of the nodes inside the
cluster, so I can't set the number of cores for the driver (although I don't
think that would make a difference).

How can I find out where the driver is spending its time? I'm not sure
whether it is possible to improve performance at this point, or whether that
second is mainly spent scheduling the graph of each microbatch.
