Dataset does have storageLevel (available since Spark 2.1), so you can use isCached = (ds.storageLevel != StorageLevel.NONE) as a test.
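A minimal sketch of that check, in case it helps. The object name CacheCheck and the helper isCached are made up for illustration and are not part of Spark's API; Dataset.storageLevel and StorageLevel.NONE are the real pieces. It assumes Spark 2.1 or later on the classpath and a local master.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.storage.StorageLevel

object CacheCheck {
  // Hypothetical helper (not a Spark API): reports whether the Dataset
  // currently has a storage level other than NONE.
  def isCached(ds: Dataset[_]): Boolean =
    ds.storageLevel != StorageLevel.NONE

  def main(args: Array[String]): Unit = {
    // Local session purely for demonstration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cache-check")
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    println(s"before cache(): ${isCached(ds)}")

    ds.cache()
    println(s"after cache():  ${isCached(ds)}")

    ds.unpersist()
    spark.stop()
  }
}
```

As far as I can tell, this reflects whether the Dataset has been marked for persistence, not whether the data has actually been materialized yet, so treat it as an approximation of what the private catalog method would tell you.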

Arguably isCached could be added to Dataset too; it shouldn't be a controversial change.

On Fri, 1 Sep 2017 at 17:31, Nathan Kronenfeld wrote:
I'm currently porting some of our code from RDDs to Datasets.

With RDDs it's pretty easy to figure out if they are cached or not.

I notice that the catalog has a function for determining this on Datasets too, but it's private[sql]. Is there any reason for it not to be public? Is there any way at the moment to determine whether a Dataset is cached?

Thanks in advance
               -Nathan Kronenfeld