spark-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: RDD object Out of scope.
Date Wed, 22 May 2019 01:23:37 GMT
I'm not clear what you're asking. An RDD itself is just an object in
the JVM. It will be garbage collected if there are no references. What
else would there be to clean up in your case? ContextCleaner handles
cleanup of persisted RDDs, etc.
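
To make the mechanism concrete, here is a minimal sketch of the weak-reference pattern that a cleaner such as ContextCleaner relies on. It is not Spark's code: the names CleanerSketch, CleanupRef, and resourceId are hypothetical, and a plain Object stands in for the driver-side RDD. The idea is that the handle is ordinary JVM garbage, and any associated cleanup is triggered only when the GC enqueues a weak reference to it.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of GC-driven cleanup; not Spark's ContextCleaner.
public class CleanerSketch {
    /** Weak reference that remembers which resource to release once its referent is collected. */
    static final class CleanupRef extends WeakReference<Object> {
        final int resourceId;

        CleanupRef(Object referent, int resourceId, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.resourceId = resourceId;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The weak references themselves must stay strongly reachable,
    // otherwise they could be collected before they are ever enqueued.
    private final Set<CleanupRef> pending = new HashSet<>();

    void register(Object handle, int resourceId) {
        pending.add(new CleanupRef(handle, resourceId, queue));
    }

    /** Waits for the GC to enqueue a reference; returns its resource id, or -1 on timeout. */
    int awaitCleanup() {
        try {
            for (int attempt = 0; attempt < 50; attempt++) {
                System.gc(); // only a hint; the JVM may ignore it
                Reference<?> ref = queue.remove(100); // wait up to 100 ms
                if (ref != null) {
                    CleanupRef done = (CleanupRef) ref;
                    pending.remove(done);
                    return done.resourceId; // a real cleaner would release the resource here
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return -1;
    }

    /** Registers a handle, drops the only strong reference to it, and waits for cleanup. */
    static int dropAndAwaitCleanup(int resourceId) {
        CleanerSketch cleaner = new CleanerSketch();
        Object handle = new Object(); // stands in for a driver-side RDD object
        cleaner.register(handle, resourceId);
        handle = null; // the handle "goes out of scope"
        return cleaner.awaitCleanup();
    }

    public static void main(String[] args) {
        int id = dropAndAwaitCleanup(42);
        System.out.println(id >= 0
                ? "would release resource " + id
                : "handle not collected within the timeout");
    }
}
```

If nothing was registered for the handle (the uncached case), there is simply no cleanup step: the object is reclaimed by the GC like any other.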

On Tue, May 21, 2019 at 7:39 PM Nasrulla Khan Haris
<Nasrulla.Khan@microsoft.com.invalid> wrote:
>
> I am trying to find the code that cleans up uncached RDD.
>
>
>
> Thanks,
>
> Nasrulla
>
>
>
> From: Charoes <charoes@gmail.com>
> Sent: Tuesday, May 21, 2019 5:10 PM
> To: Nasrulla Khan Haris <Nasrulla.Khan@microsoft.com.invalid>
> Cc: Wenchen Fan <cloud0fan@gmail.com>; dev@spark.apache.org
> Subject: Re: RDD object Out of scope.
>
>
>
> If you cached an RDD and hold a reference to that RDD in your code, then your RDD will NOT be cleaned up.
>
> There is a ReferenceQueue in ContextCleaner, which is used to keep track of references to RDDs, Broadcasts, Accumulators, etc.
>
>
>
> On Wed, May 22, 2019 at 1:07 AM Nasrulla Khan Haris <Nasrulla.Khan@microsoft.com.invalid> wrote:
>
> Thanks for the reply, Wenchen. I am curious what happens when an RDD goes out of scope when it is not cached.
>
>
>
> Nasrulla
>
>
>
> From: Wenchen Fan <cloud0fan@gmail.com>
> Sent: Tuesday, May 21, 2019 6:28 AM
> To: Nasrulla Khan Haris <Nasrulla.Khan@microsoft.com.invalid>
> Cc: dev@spark.apache.org
> Subject: Re: RDD object Out of scope.
>
>
>
> An RDD is kind of a pointer to the actual data. Unless it's cached, we don't need to clean up the RDD.
>
>
>
> On Tue, May 21, 2019 at 1:48 PM Nasrulla Khan Haris <Nasrulla.Khan@microsoft.com.invalid> wrote:
>
> Hi Spark developers,
>
>
>
> Can someone point out the code where RDD objects go out of scope? I found the ContextCleaner code, in which only persisted RDDs are cleaned up at regular intervals if the RDD is registered for cleanup. I have not found where the destructor for the RDD object is invoked. I am trying to understand when RDD cleanup happens when the RDD is not persisted.
>
>
>
> Thanks in advance, appreciate your help.
>
> Nasrulla
>
>
