An RDD is essentially a pointer to the actual data, not the data itself. Unless it's cached, we don't need to clean up the RDD; the object itself is garbage-collected by the JVM like any other.
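
To make the idea concrete, here is a rough, Spark-free analogy in Python. It is not Spark code: `ToyRDD`, `BlockStore`, and `demo` are hypothetical names invented for illustration. It only sketches the pattern of a cleanup hook registered through a weak reference, and only when the handle is persisted. An unpersisted handle registers nothing and simply disappears when garbage-collected.

```python
import gc
import weakref


class BlockStore:
    """Hypothetical stand-in for the storage that holds cached data."""

    def __init__(self):
        self.blocks = {}

    def put(self, rdd_id, data):
        self.blocks[rdd_id] = data

    def remove(self, rdd_id):
        self.blocks.pop(rdd_id, None)


class ToyRDD:
    """Hypothetical handle: holds a recipe for the data, not the data."""

    def __init__(self, rdd_id, compute, store):
        self.rdd_id = rdd_id
        self.compute = compute
        self.store = store

    def persist(self):
        # Only a persisted handle owns external state, so only it
        # registers a cleanup callback (via a weak reference).
        self.store.put(self.rdd_id, self.compute())
        weakref.finalize(self, self.store.remove, self.rdd_id)
        return self


def demo():
    store = BlockStore()
    plain = ToyRDD(1, lambda: [1, 2, 3], store)             # never cached
    cached = ToyRDD(2, lambda: [4, 5, 6], store).persist()  # cached
    before = sorted(store.blocks)
    del plain, cached   # both handles become unreachable
    gc.collect()        # the finalizer removes the cached block
    after = sorted(store.blocks)
    return before, after
```

Calling `demo()` shows that only the persisted handle ever had anything in the store, and that its block is removed once the handle is unreachable.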

On Tue, May 21, 2019 at 1:48 PM Nasrulla Khan Haris <Nasrulla.Khan@microsoft.com.invalid> wrote:

Hi Spark developers,

Can someone point me to the code where RDD objects go out of scope? I found the ContextCleaner code, in which only persisted RDDs are cleaned up at regular intervals, and only if the RDD is registered for cleanup. I have not found where the destructor for an RDD object is invoked. I am trying to understand when RDD cleanup happens when the RDD is not persisted.

Thanks in advance; I appreciate your help.

Nasrulla