spark-user mailing list archives

From Parag Mohanty <>
Subject [Spark conf setting] spark.sql.parquet.cacheMetadata = true still invalidates cache in memory.
Date Thu, 01 Jul 2021 14:25:12 GMT
Hi Team,
I am trying to read a Parquet file, cache the resulting DataFrame, apply a
transformation, and then overwrite the same Parquet file, all within a
single session.
The first count() action does not seem to cache the DataFrame; it only
shows up as cached once the transformed DataFrame is cached.
Even with spark.sql.parquet.cacheMetadata = true, the write operation
still invalidates the cache.
Is this expected? What is this configuration setting actually meant to do?

We are using PySpark with Spark running in cluster mode.
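
For reference, here is a minimal sketch of the sequence described above, as I read it. The function names, the app name, and the dropDuplicates() transformation are hypothetical placeholders, since the original job is not shown. The sketch makes no claim about what ends up cached: that is best checked in the Storage tab of the Spark UI after each step, and the behaviour may differ between Spark versions.

```python
def build_session():
    """Create a session with the setting from the question enabled."""
    # Imported lazily so the module loads even where pyspark is absent.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("parquet-cache-question")  # hypothetical app name
        .config("spark.sql.parquet.cacheMetadata", "true")
        .getOrCreate()
    )


def run_scenario(spark, path):
    """Read, cache, transform, cache again, then overwrite the same path."""
    df = spark.read.parquet(path)
    df.cache()                    # marks the DataFrame for caching (lazy)
    rows_before = df.count()      # first action on the source DataFrame

    # Placeholder transformation; the real one is not given in the question.
    transformed = df.dropDuplicates()
    transformed.cache()
    rows_after = transformed.count()

    # Overwrite the path that the source DataFrame was read from.
    transformed.write.mode("overwrite").parquet(path)
    return rows_before, rows_after
```

It could be called as run_scenario(build_session(), path), where path points to an existing Parquet dataset.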
Parag Mohanty
