spark-user mailing list archives

From Daniel Haviv <daniel.ha...@veracity-group.com>
Subject Re: Switching broadcast mechanism from torrent
Date Sun, 19 Jun 2016 13:51:14 GMT
Hi,
Just updating on my findings for future reference.
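The problem was that after refactoring my code I ended up with a Scala
object which held a SparkContext as a member, e.g.:
import org.apache.spark.SparkContext

object A {
     // SparkContext is not serializable and should only exist on the driver
     val sc: SparkContext = new SparkContext()
     def mapFunction(x: Int): Int = x  // placeholder body
}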

and when I called rdd.map(A.mapFunction) it failed, because A.sc is not
serializable.
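
For anyone hitting the same thing, here is a minimal sketch of one way to
restructure the code, assuming the functions passed to map do not need the
context themselves. The names Transformations and Driver, the doubling logic
and the local[*] master are illustrative assumptions, not my actual code:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Holds only pure functions. Nothing in this object references a
// SparkContext, so using its methods inside rdd.map is safe on executors.
object Transformations {
  def mapFunction(x: Int): Int = x * 2
}

object Driver {
  def main(args: Array[String]): Unit = {
    // The SparkContext is created and used only in driver-side code.
    // local[*] is assumed here so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val result = sc.parallelize(1 to 5)
        .map(Transformations.mapFunction)
        .collect()
      println(result.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

The point of the split is that the object referenced from the closure carries
no driver-only state, so nothing tied to the SparkContext has to be shipped
to or initialized on the executors.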

Thanks,
Daniel

On Tue, Jun 7, 2016 at 10:13 AM, Takeshi Yamamuro <linguin.m.s@gmail.com>
wrote:

> Hi,
>
> Since `HttpBroadcastFactory` has already been removed in master, you
> will not be able to use that broadcast mechanism in future releases.
>
> Anyway, I couldn't identify a root cause from the stack traces alone...
>
> // maropu
>
>
>
>
> On Mon, Jun 6, 2016 at 2:14 AM, Daniel Haviv <
> daniel.haviv@veracity-group.com> wrote:
>
>> Hi,
>> I've set spark.broadcast.factory to
>> org.apache.spark.broadcast.HttpBroadcastFactory and it indeed resolved my
>> issue.
>>
>> I'm creating a DataFrame, which creates a broadcast variable internally
>> and then fails in the torrent broadcast with the following stack trace:
>> Caused by: org.apache.spark.SparkException: Failed to get broadcast_3_piece0 of broadcast_3
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>         at scala.Option.getOrElse(Option.scala:120)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>         at scala.collection.immutable.List.foreach(List.scala:318)
>>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1220)
>>
>> I'm using Spark 1.6.0 on CDH 5.7.
>>
>> Thanks,
>> Daniel
>>
>>
>> On Wed, Jun 1, 2016 at 5:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> I found spark.broadcast.blockSize but no parameter to switch the broadcast
>>> method.
>>>
>>> Can you describe the issues with torrent broadcast in more detail?
>>>
>>> Which version of Spark are you using?
>>>
>>> Thanks
>>>
>>> On Wed, Jun 1, 2016 at 7:48 AM, Daniel Haviv <
>>> daniel.haviv@veracity-group.com> wrote:
>>>
>>>> Hi,
>>>> Our application is failing due to issues with the torrent broadcast. Is
>>>> there a way to switch to another broadcast method?
>>>>
>>>> Thank you.
>>>> Daniel
>>>>
>>>
>>>
>>
>
>
> --
> ---
> Takeshi Yamamuro
>
