spark-user mailing list archives

From Matthew Dailey <>
Subject Are Task Closures guaranteed to be accessed by only one Thread?
Date Wed, 05 Oct 2016 16:23:38 GMT
Looking at the programming guide for Spark 1.6.1, it states:

> Prior to execution, Spark computes the task’s closure. The closure is those variables and methods which must be visible for the executor to perform its computations on the RDD.

> The variables within the closure sent to each executor are now copies.

So my question is: will an executor ever access a single copy of the closure
from more than one thread?  I ask because I want to know whether I can ignore
thread-safety in a function I write.  Take a look at this gist for a
simplified example with a thread-unsafe operation being passed to map():
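The gist itself is not reproduced in the archive, but a minimal sketch of the kind of thread-unsafe operation in question might look like the following (all names here are hypothetical, and a plain thread pool stands in for Spark task threads so the sketch runs standalone). `java.text.SimpleDateFormat` is a well-known non-thread-safe class: it keeps mutable parse state, so if an executor ran one shared closure copy from several task threads, a captured instance could silently corrupt results.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ClosureThreadSafety {
    // Not thread-safe: SimpleDateFormat holds mutable internal state,
    // so sharing one instance across threads can corrupt parses.
    static final SimpleDateFormat shared = new SimpleDateFormat("yyyy-MM-dd");

    // Defensive variant: one instance per thread, which sidesteps the
    // question of how many threads touch a given closure copy.
    static final ThreadLocal<SimpleDateFormat> perThread =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    // Parse the same date from many threads using the per-thread formats;
    // returns true when every result agrees with a reference parse.
    static boolean parseAllConsistently() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Date>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            results.add(pool.submit(() -> perThread.get().parse("2016-10-05")));
        }
        Date expected = perThread.get().parse("2016-10-05");
        boolean ok = true;
        for (Future<Date> f : results) {
            ok &= expected.equals(f.get());
        }
        pool.shutdown();
        return ok;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("consistent: " + parseAllConsistently());
    }
}
```

In a real job the function using `perThread.get()` would be the one handed to `rdd.map(...)`; the `ThreadLocal` wrapper makes the code safe regardless of how Spark schedules threads onto closure copies.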

This is for Spark Streaming, but I suspect the answer is the same for
batch and streaming.

Thanks for any help,
