spark-user mailing list archives

From Bahubali Jain <bahub...@gmail.com>
Subject Dataframe size using RDDStorageInfo objects
Date Sat, 17 Mar 2018 06:05:20 GMT
Hi,
I am trying to figure out a way to find the size of *persisted* dataframes
using *sparkContext.getRDDStorageInfo()*.
Each RDDStorageInfo object reports the number of bytes stored in memory and
on disk.

For example, say I have 3 dataframes which I have cached:
df1.cache()
df2.cache()
df3.cache()

The call to sparkContext.getRDDStorageInfo() would then return an array of 3
RDDStorageInfo objects, since I have cached 3 dataframes.
But I see no way to map df1, df2, and df3 to their corresponding
RDDStorageInfo objects.
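
One workaround I have been considering (a minimal sketch, not a confirmed
solution — the view name "df1_view" is made up for illustration): instead of
calling df.cache() directly, register each dataframe as a temp view and cache
it through the catalog. Spark then labels the cached RDD
"In-memory table <viewName>", so the name field of each RDDStorageInfo can be
matched back to the dataframe it came from.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("storage-info-demo").getOrCreate()

// A hypothetical dataframe standing in for df1.
val df1 = spark.range(100).toDF("id")

// Cache via the catalog under a known name instead of df1.cache().
df1.createOrReplaceTempView("df1_view")
spark.catalog.cacheTable("df1_view")
df1.count() // run an action so the cache is actually materialized

// Each RDDStorageInfo now carries a name like "In-memory table df1_view",
// which ties it back to the dataframe that was cached.
spark.sparkContext.getRDDStorageInfo.foreach { info =>
  println(s"id=${info.id} name=${info.name} " +
    s"memSize=${info.memSize} diskSize=${info.diskSize}")
}
```

This relies on the RDD naming convention the cache manager happens to use, so
it may differ across Spark versions.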

Is there any better way out?

Thanks,
Baahu
