spark-issues mailing list archives

From "Sean R. Owen (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-28575) Time lag between two consecutive spark actions using Spark 2.3.1
Date Sat, 26 Oct 2019 23:08:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-28575.
----------------------------------
    Resolution: Invalid

There's not enough information here. We don't even know what your code is doing in between.
There could be other sources of legitimate difference here.

> Time lag between two consecutive spark actions using Spark 2.3.1
> ----------------------------------------------------------------
>
>                 Key: SPARK-28575
>                 URL: https://issues.apache.org/jira/browse/SPARK-28575
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.1
>            Reporter: Kushal Mahajan
>            Priority: Major
>         Attachments: spark_2.1_screenshot.PNG, spark_2.3_screenshot.PNG
>
>
> Steps to reproduce:
>  # Read a directory (consisting of txt files) using the SparkContext wholeTextFiles method
>  # Perform a transformation on the resulting pair RDD
>  # Perform an action (foreach) on each entry, one per txt file
>  # A time lag can be seen between these actions in the Spark UI.
> The action itself does not take much time. There is a time lag between the start times
> of consecutive actions (excluding the time taken by the job itself). Kindly refer to the attachments.
> PS: This time lag is not seen when running the job in Spark 2.1.1.
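
The lag described in the report (time between consecutive actions, excluding the time each job itself takes) could be quantified roughly as in the sketch below. This is only an illustration, not the reporter's code: the helper name idle_gaps and the job timings are hypothetical, standing in for the submission and completion times shown per job in the Spark UI.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# (job id, submission time, completion time); hypothetical shape that
# mirrors the per-job timestamps listed in the Spark UI.
Job = Tuple[str, datetime, datetime]


def idle_gaps(jobs: List[Job]) -> List[Tuple[str, str, timedelta]]:
    """Return (previous id, next id, idle time) for consecutive jobs.

    Idle time is the interval between one job completing and the next
    job being submitted, so each job's own run time is excluded.
    """
    ordered = sorted(jobs, key=lambda job: job[1])  # order by submission
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt[1] - prev[2]
        # Overlapping jobs would give a negative gap; clamp to zero.
        gaps.append((prev[0], nxt[0], max(gap, timedelta(0))))
    return gaps


# Hypothetical timings, not taken from the attached screenshots.
t0 = datetime(2019, 1, 1, 10, 0, 0)
jobs = [
    ("job-0", t0, t0 + timedelta(seconds=2)),
    ("job-1", t0 + timedelta(seconds=9), t0 + timedelta(seconds=11)),
    ("job-2", t0 + timedelta(seconds=18), t0 + timedelta(seconds=19)),
]

for prev_id, next_id, gap in idle_gaps(jobs):
    print(f"{prev_id} -> {next_id}: idle {gap.total_seconds():.1f}s")
```

Comparing these idle intervals between a 2.1.1 run and a 2.3.1 run, together with what the driver code does between actions, is the kind of detail the resolution comment says is missing.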


