spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-28575) Time lag between two consecutive spark actions using Spark 2.3.1
Date Fri, 16 Aug 2019 04:58:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908710#comment-16908710 ]

Hyukjin Kwon commented on SPARK-28575:
--------------------------------------

[~kumahaja], can you please share some code that reproduces the issue?

> Time lag between two consecutive spark actions using Spark 2.3.1
> ----------------------------------------------------------------
>
>                 Key: SPARK-28575
>                 URL: https://issues.apache.org/jira/browse/SPARK-28575
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.1
>            Reporter: Kushal Mahajan
>            Priority: Major
>         Attachments: spark_2.1_screenshot.PNG, spark_2.3_screenshot.PNG
>
>
> Steps to reproduce:
>  # Read a directory of .txt files using the SparkContext's wholeTextFiles method.
>  # Perform a transformation on the resulting pair RDD.
>  # Perform an action (foreach) on each entry corresponding to each .txt file.
>  # A time lag can be seen between these actions in the Spark UI.
> The actions themselves do not take that long. There is a time lag between the start times
> of consecutive actions beyond the time taken by the jobs themselves. Kindly refer to the attachments.
> PS: This time lag is not seen when running the job in Spark 2.1.1.
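
To make the reported lag concrete, here is a minimal sketch of how it could be computed from the per-job submission and completion times shown in the Spark UI. It reads the lag as the idle time between one job's completion and the next job's submission, which is one interpretation of the description above. The function name inter_job_lags and the timestamps are hypothetical, chosen for illustration only; they are not taken from the attached screenshots.

```python
from datetime import datetime, timedelta


def inter_job_lags(jobs):
    """Return the idle time between each job's completion and the next job's submission.

    `jobs` is a list of (submitted, completed) datetime pairs in submission order.
    """
    return [
        next_submitted - completed
        for (_, completed), (next_submitted, _) in zip(jobs, jobs[1:])
    ]


# Made-up timestamps for illustration; not measurements from SPARK-28575.
t0 = datetime(2019, 8, 1, 10, 0, 0)
jobs = [
    (t0, t0 + timedelta(seconds=2)),
    (t0 + timedelta(seconds=7), t0 + timedelta(seconds=9)),
    (t0 + timedelta(seconds=15), t0 + timedelta(seconds=16)),
]

lags = inter_job_lags(jobs)
print([lag.total_seconds() for lag in lags])
```

Comparing these gaps for the same job on 2.1.1 and 2.3.1 would show whether the extra time falls between jobs rather than inside them.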



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


