spark-user mailing list archives

From William Briggs <wrbri...@gmail.com>
Subject Re: SparkContext & Threading
Date Sat, 06 Jun 2015 18:56:46 GMT
Hi Lee, I only have mobile devices for correspondence right now, so I can't
get to a shell to test this; what follows is supposition. I think the
lambdas are closing over the context because it is a constructor parameter
of your Runnable class. That would explain why inlining the lambdas into
your main method doesn't show the issue.
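
To illustrate the suspected mechanism, here is a minimal sketch in plain Java
with no Spark dependency. Every name in it (FakeContext, RollupTask,
SerFunction, isSerializable) is hypothetical and stands in for the real code,
which I have not seen in full. It assumes the lambda reads a field of the
enclosing Runnable, which makes it capture `this` and, through it, the
non-serializable context.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class ClosureCaptureSketch {

    // Hypothetical stand-in for SparkContext: deliberately NOT Serializable.
    static class FakeContext {}

    // A function type that must be serializable, similar in spirit to the
    // functions that get shipped to executors.
    interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    // Hypothetical stand-in for the Runnable that takes the context
    // as a constructor parameter.
    static class RollupTask implements Runnable {
        private final FakeContext context;
        private final int offset;

        RollupTask(FakeContext context, int offset) {
            this.context = context;
            this.offset = offset;
        }

        // Reads the field `offset`, so the lambda captures `this`,
        // dragging the whole RollupTask (and its context) along.
        SerFunction<Integer, Integer> capturingThis() {
            return x -> x + offset;
        }

        // Copies the field into a local first, so only the int is captured.
        SerFunction<Integer, Integer> capturingLocal() {
            final int localOffset = offset;
            return x -> x + localOffset;
        }

        @Override
        public void run() {
            // Intentionally empty: only the closures matter for this sketch.
        }
    }

    // Attempts Java serialization and reports whether it succeeded.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        RollupTask task = new RollupTask(new FakeContext(), 1);
        System.out.println("capturing this:  " + isSerializable(task.capturingThis()));
        System.out.println("capturing local: " + isSerializable(task.capturingLocal()));
    }
}
```

If this is what is going on, the first closure should fail to serialize and
the second should succeed, regardless of which thread runs the code. Copying
the needed values into local variables before building the lambda is the
usual way to keep the enclosing object out of the closure.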

On Sat, Jun 6, 2015, 10:55 AM Lee McFadden <spleeman@gmail.com> wrote:

> Hi Will,
>
> That doesn't seem to be the case and was part of the source of my
> confusion. The code currently in the run method of the runnable works
> perfectly fine with the lambda expressions when it is invoked from the main
> method. They also work when they are invoked from within a separate method
> on the Transforms object.
>
> It was only when putting that same code into another thread that the
> serialization exception occurred.
>
> Examples throughout the spark docs also use lambda expressions a lot -
> surely those examples also would not work if this is always an issue with
> lambdas?
>
> On Sat, Jun 6, 2015, 12:21 AM Will Briggs <wrbriggs@gmail.com> wrote:
>
>> Hi Lee, it's actually not related to threading at all - you would still
>> have the same problem even if you were using a single thread. See this
>> section (
>> https://spark.apache.org/docs/latest/programming-guide.html#passing-functions-to-spark)
>> of the Spark docs.
>>
>>
>> On June 5, 2015, at 5:12 PM, Lee McFadden <spleeman@gmail.com> wrote:
>>
>>
>> On Fri, Jun 5, 2015 at 2:05 PM Will Briggs <wrbriggs@gmail.com> wrote:
>>
>>> Your lambda expressions on the RDDs in the SecondRollup class are
>>> closing around the context, and Spark has special logic to ensure that all
>>> variables in a closure used on an RDD are Serializable - I hate linking to
>>> Quora, but there's a good explanation here:
>>> http://www.quora.com/What-does-Closure-cleaner-func-mean-in-Spark
>>>
>>
>> Ah, I see!  So if I broke out the lambda expressions into a method on an
>> object it would prevent this issue.  Essentially, "don't use lambda
>> expressions when using threads".
>>
>> Thanks again, I appreciate the help.
>>
>
