spark-user mailing list archives

From "Yuri Oleynikov (‫יורי אולייניקוב‬‎)" <yur...@gmail.com>
Subject Re: Caching
Date Mon, 07 Dec 2020 18:51:47 GMT
Are you using the same CSV twice?
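
If so, a single action can still evaluate DF1 more than once: it appears in the plan directly (in the join) and again underneath DF2, and Spark does not necessarily reuse the first scan for the second branch. Below is a minimal sketch of that situation, not your actual job. The file paths, the join column `key`, the `count` aggregation, and the Parquet output are all placeholders I made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, lit}

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()

    // Hypothetical input; path and column names are placeholders.
    // cache() marks DF1 for reuse; it is materialized the first time it is computed.
    val df1 = spark.read.option("header", "true").csv("/path/to/first.csv").cache()

    // DF2 is derived from DF1, so DF1 sits in the lineage of DF2 as well.
    val df2 = df1.groupBy("key").agg(count(lit(1)).as("n")).select("key", "n")

    // DF3 references DF1 twice: once directly and once through DF2.
    val df3 = spark.read.option("header", "true").csv("/path/to/second.csv")
      .join(df1, "key")
      .join(df2, "key")

    // Inspect the physical plan to see how each branch reads DF1.
    df3.explain()

    // The single action; both branches that need DF1 are evaluated here.
    df3.write.mode("overwrite").parquet("/path/to/output")

    df1.unpersist()
    spark.stop()
  }
}
```

Comparing the `explain()` output with and without the `.cache()` call should show whether the first CSV is scanned once or once per branch in your case.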

Sent from my iPhone

> On 7 Dec 2020, at 18:32, Amit Sharma <resolve123@gmail.com> wrote:
> 
> 
> Hi all, I am using caching in my code. I have DataFrames like this:
> val DF1 = read csv
> val DF2 = DF1.groupBy().agg().select(.....)
> 
> val DF3 = read csv .join(DF1).join(DF2)
> DF3.save
> 
> If I do not cache DF2 or DF1, the job takes longer. But I am running only one action, so why do I need to cache?
> 
> Thanks
> Amit
> 
> 


