spark-user mailing list archives

From Hanumath Rao Maduri <hanu....@gmail.com>
Subject Re: Spark RDD and Memory
Date Thu, 22 Sep 2016 17:09:43 GMT
Hello Aditya,

Once the action that uses an RDD has completed, you can call
rdd.unpersist() to let Spark know that the RDD is no longer required.
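
A minimal sketch of that pattern follows, under some assumptions that are
not in the original thread: the inputs are small in-memory stand-ins for the
two text files, each line is assumed to be "key,value" so it can be turned
into a pair RDD (join is only defined on RDDs of key/value pairs), and the
object name UnpersistSketch and the helper keyed are hypothetical. Note that
unpersist() only has something to release if the RDD was marked with
persist() or cache() beforehand; an RDD that was never persisted is not
retained in the block store after its partitions have been consumed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object UnpersistSketch {
  // Hypothetical helper: split "key,value" lines into pairs so join applies.
  def keyed(lines: RDD[String]): RDD[(String, String)] =
    lines.map { line =>
      val Array(k, v) = line.split(",", 2)
      (k, v)
    }

  def run(sc: SparkContext): Array[(String, (String, String))] = {
    // Assumed sample data standing in for the two files in the question.
    val left  = keyed(sc.parallelize(Seq("1,alice", "2,bob")))
      .persist(StorageLevel.MEMORY_ONLY)
    val right = keyed(sc.parallelize(Seq("1,sales", "2,ops")))
      .persist(StorageLevel.MEMORY_ONLY)

    // Cache the join result because later steps reuse it.
    val joined = left.join(right).cache()
    joined.count() // action: computes left, right and joined

    // The inputs are no longer needed, so release their cached blocks.
    left.unpersist()
    right.unpersist()

    // Later work reads the cached join result, not the inputs.
    val rows = joined.collect().sortBy(_._1)
    joined.unpersist()
    rows
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("unpersist-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try run(sc).foreach(println) finally sc.stop()
  }
}
```

If the inputs are never persisted in the first place, the two unpersist()
calls on them are unnecessary; the sketch persists them only to show what
the call releases.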

Thanks,
-Hanu

On Thu, Sep 22, 2016 at 7:54 AM, Aditya <aditya.calangutkar@augmentiq.co.in>
wrote:

> Hi,
>
> Suppose I have two RDDs
> val textFile = sc.textFile("/user/emp.txt")
> val textFile1 = sc.textFile("/user/emp1.xt")
>
> Later I perform a join operation on the two RDDs above:
> val join = textFile.join(textFile1)
>
> The subsequent transformations do not reference textFile or textFile1
> again, and an action then starts the execution.
>
> When the action is called, textFile and textFile1 will be loaded into
> memory first. The join will then be performed and its result kept in memory.
> My question is: once the join result is in memory and is used for
> subsequent execution, what happens to the textFile and textFile1 RDDs? Are
> they still kept in memory until the full lineage graph is completed, or are
> they destroyed once they are no longer needed? If they are kept in memory,
> is there any way I can explicitly remove them to free the memory?
>
