spark-user mailing list archives

From: Daniele Foroni
Subject: Dataset Caching and Unpersisting
Date: Wed, 02 May 2018 11:19:21 GMT

Hi all,

I am having trouble with caching and unpersisting a dataset.
I have a loop that filters my dataset at each iteration.
I found that caching every x steps (e.g., every 50 steps) gives good performance.

However, after a certain number of caching operations, the memory used for caching seems
to fill up, so I think I need to unpersist the previously cached datasets.

This is my code:

I tried using an external variable to hold the cached dataset and unpersist it later, but that
doesn't seem to solve the problem (maybe I used it the wrong way).
Do you have any suggestions?

Thank you for your support!
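
The approach described above (an external variable that holds the last cached dataset so it can be unpersisted once a newer one is cached) might look roughly like the sketch below. This is not the original code, which is missing from the message. It is written in Python and assumes `df` behaves like a PySpark `DataFrame`, using only `filter()`, `cache()`, `count()` and `unpersist()`. The function name `filter_with_periodic_cache` and the parameters `predicates` and `cache_every` are hypothetical names chosen for illustration.

```python
def filter_with_periodic_cache(df, predicates, cache_every=50):
    """Apply each predicate in turn, caching every `cache_every` steps.

    Assumption: `df` is DataFrame-like, i.e. it offers filter(), cache(),
    count() and unpersist(), as pyspark.sql.DataFrame does.
    """
    # The "external variable": the most recently cached dataset, if any.
    previous_cached = None

    for step, predicate in enumerate(predicates, start=1):
        df = df.filter(predicate)

        if step % cache_every == 0:
            # cache() is lazy, so run an action to populate the cache now,
            # while the older cached dataset is still available as input.
            df.cache()
            df.count()

            # Only after the new cache is materialized is the old one released.
            if previous_cached is not None:
                previous_cached.unpersist()
            previous_cached = df

    return df
```

With a real session this could be called as, for example, `filter_with_periodic_cache(spark.range(1000), ["id % 2 = 0", "id > 10"], cache_every=1)`. The ordering is the point of the sketch: the new dataset is materialized before the old one is unpersisted. Whether unpersisting an ancestor leaves the newer cache intact may depend on the Spark version, so this should be verified against the version in use.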
