spark-user mailing list archives

From Tomas Bartalos <tomas.barta...@gmail.com>
Subject Access to live data of cached dataFrame
Date Fri, 17 May 2019 18:37:34 GMT
Hello,

I have a cached dataframe:

spark.read.format("delta").load("/data").groupBy(col("event_hour")).count.cache

I would like to access the "live" data for this data frame without dropping
the cache (that is, without calling unpersist()). Whatever I do, I always get
the cached data on subsequent queries. Even adding a new column to the query
doesn't help:

spark.read.format("delta").load("/data").groupBy(col("event_hour")).count.withColumn("dummy", lit("dummy"))


I'm able to work around this using a cached SQL view, but I couldn't find a
pure DataFrame solution.
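
For reference, here is a minimal sketch of how the behaviour can be checked in spark-shell. It assumes the predefined `spark` session and the same Delta table at /data as above; the names `cached` and `again` are only illustrative. If the physical plan printed by explain() contains an InMemoryTableScan node, the query is being answered from the cache. As far as I understand, Spark matches cached plans against sub-plans of a new query, which would explain why adding a column on top does not bypass the cache.

```scala
import org.apache.spark.sql.functions.{col, lit}

// Build and cache the aggregate (cache() is lazy, so run an action to materialize it).
val cached = spark.read.format("delta").load("/data")
  .groupBy(col("event_hour"))
  .count()
  .cache()
cached.count()

// Rebuild the same aggregate from scratch and add an extra column on top.
val again = spark.read.format("delta").load("/data")
  .groupBy(col("event_hour"))
  .count()
  .withColumn("dummy", lit("dummy"))

// Print the physical plan; an InMemoryTableScan node means the cached data is reused.
again.explain()
```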

Thank you,
Tomas
