spark-user mailing list archives

From "Aditya" <>
Subject Spark RDD and Memory
Date Thu, 22 Sep 2016 14:54:24 GMT

Suppose I have two RDDs:
val textFile = sc.textFile("/user/emp.txt")
val textFile1 = sc.textFile("/user/emp1.txt")

Later I perform a join operation on the above two RDDs:
val join = textFile.join(textFile1)
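
Note that join is only available on RDDs of key/value pairs, so the lines 
returned by sc.textFile would first have to be turned into pairs. A minimal 
sketch of that step, assuming each line has the form "id,rest" (the post does 
not say what the files contain). The names JoinSketch, toPair and 
keyByFirstField are hypothetical, and small in-memory samples stand in for the 
two files:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinSketch {
  // Hypothetical helper: split "id,rest" into (id, rest).
  // Assumption: the first comma-separated field is the join key.
  def toPair(line: String): (String, String) =
    line.split(",", 2) match {
      case Array(key, rest) => (key, rest)
      case Array(key)       => (key, "") // line without a comma
    }

  // Turn an RDD of raw lines into a pair RDD so that join is available.
  def keyByFirstField(lines: RDD[String]): RDD[(String, String)] =
    lines.map(toPair)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("join-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // In-memory stand-ins for sc.textFile(...) on the two files.
      val emp  = keyByFirstField(sc.parallelize(Seq("1,Alice", "2,Bob")))
      val emp1 = keyByFirstField(sc.parallelize(Seq("1,Sales", "3,HR")))

      // Inner join: only keys present on both sides survive.
      val joined: RDD[(String, (String, String))] = emp.join(emp1)
      joined.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```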

There are then further transformations that do not reference textFile or 
textFile1 again, followed by an action to start the execution.

When the action is called, textFile and textFile1 will be loaded into memory 
first. The join will then be performed and its result kept in memory.
My question is: once the join result is in memory and is being used for 
subsequent execution, what happens to the textFile and textFile1 RDDs? Are 
they kept in memory until the full lineage graph has been executed, or are 
they destroyed once they are no longer needed? If they are kept in memory, is 
there any way I can explicitly remove them to free the memory?
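
For reference, a minimal sketch of how the storage state could be inspected 
and released, using small in-memory pair RDDs in place of the two files (an 
assumption, as are the names CacheSketch and run). It relies on the RDD API 
behaviour that partitions are only retained when persist() or cache() has 
been called on an RDD, and that unpersist() releases them again:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheSketch {
  // Returns (inputs unpersisted?, persisted RDDs after caching, after unpersist).
  def run(): (Boolean, Int, Int) = {
    val conf = new SparkConf().setAppName("cache-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-ins for the two input RDDs; neither is persisted.
      val left  = sc.parallelize(Seq(("1", "Alice"), ("2", "Bob")))
      val right = sc.parallelize(Seq(("1", "Sales"), ("3", "HR")))

      // An RDD that was never persisted reports StorageLevel.NONE.
      val inputsNotCached =
        left.getStorageLevel == StorageLevel.NONE &&
          right.getStorageLevel == StorageLevel.NONE

      // Explicitly ask Spark to keep the join result in memory.
      val joined = left.join(right).persist(StorageLevel.MEMORY_ONLY)
      joined.count() // an action is needed to actually materialize the cache

      val persistedBefore = sc.getPersistentRDDs.size

      // Explicitly release the cached partitions once they are not needed.
      joined.unpersist(blocking = true)
      val persistedAfter = sc.getPersistentRDDs.size

      (inputsNotCached, persistedBefore, persistedAfter)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The same unpersist() call would apply to textFile and textFile1 if they had 
been cached earlier in the job.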
