spark-issues mailing list archives

From "Edmond La Chance (Jira)"
Subject [jira] [Created] (SPARK-29315) RDD.cache() called early creates problems
Date Tue, 01 Oct 2019 12:32:00 GMT
Edmond La Chance created SPARK-29315:

             Summary: RDD.cache() called early creates problems
                 Key: SPARK-29315
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4
         Environment: Apache Spark 2.4.4, Windows 10
            Reporter: Edmond La Chance

This is the first issue I have posted here. I noticed that when I call RDD.cache() early in
my code, the results are all wrong.
If I remove the call to cache(), or move it later in the code, after the first map
transformation, everything works fine.
The graph is created from a data structure that already contains the random values.


I have posted versions that work and versions that don't work in this gist.


This message was sent by Atlassian Jira
