spark-user mailing list archives

From Amit Sharma <resolve...@gmail.com>
Subject spark dataset.cache is not thread safe
Date Sun, 21 Jul 2019 23:18:51 GMT
Hi, I wrote some code in a Future block that reads data from a dataset and
caches it for use later in the code. I ran into an issue where the data
cached by data.cache() is replaced by a concurrently running thread. Is
there any way to avoid this condition?

val dailyData = callDetailsDS.collect.toList
val adjustedData = dailyData.map(callDataPerDay => Future {
  // Filter on the date column, then cache the result for use later in this block.
  val data = callDetailsDS.filter(callDetailsDS(DateColumn) geq (some conditional date))
  data.cache()

  ....................

})
