spark-user mailing list archives

From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: Switching broadcast mechanism from torrent
Date Sun, 19 Jun 2016 14:10:10 GMT
How about using the `@transient` annotation on that member?
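
A minimal sketch of that idea, using plain Java serialization rather than Spark so it runs on its own. `FakeContext`, `EagerHolder`, `TransientHolder`, and `canSerialize` are hypothetical names made up for illustration, and `FakeContext` only stands in for a non-serializable member such as SparkContext. It also uses classes instead of the `object` from the report, so it shows the annotation's effect in general and is not a reproduction of the original failure.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable resource such as SparkContext.
class FakeContext

// Serializable holder: the ctx field is written along with the instance.
class EagerHolder extends Serializable {
  val ctx: FakeContext = new FakeContext
  def mapFunction(x: Int): Int = x + 1
}

// Same holder, but ctx is marked @transient, so serialization skips it.
// After deserialization the field is null, so it must not be used there.
class TransientHolder extends Serializable {
  @transient val ctx: FakeContext = new FakeContext
  def mapFunction(x: Int): Int = x + 1
}

object TransientSketch {
  // Returns true if Java serialization of `value` succeeds.
  def canSerialize(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    println(canSerialize(new EagerHolder))      // non-transient FakeContext field
    println(canSerialize(new TransientHolder))  // FakeContext field is skipped
  }
}
```

The trade-off is that a `@transient` member is simply absent on the receiving side, so this only helps if the mapped function never touches it.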

// maropu

On Sun, Jun 19, 2016 at 10:51 PM, Daniel Haviv <
daniel.haviv@veracity-group.com> wrote:

> Hi,
> Just updating on my findings for future reference.
> The problem was that after refactoring my code I ended up with a Scala
> object which held a SparkContext as a member, e.g.:
> object A {
>   val sc: SparkContext = new SparkContext()
>   def mapFunction(x: Int): Int = x  // placeholder body
> }
>
> and when I called rdd.map(A.mapFunction) it failed because A.sc is not
> serializable.
>
> Thanks,
> Daniel
>
> On Tue, Jun 7, 2016 at 10:13 AM, Takeshi Yamamuro <linguin.m.s@gmail.com>
> wrote:
>
>> Hi,
>>
>> `HttpBroadcastFactory` has already been removed in master, so you
>> will not be able to use that broadcast mechanism in future releases.
>>
>> Anyway, I couldn't identify a root cause from the stack traces alone...
>>
>> // maropu
>>
>>
>>
>>
>> On Mon, Jun 6, 2016 at 2:14 AM, Daniel Haviv <
>> daniel.haviv@veracity-group.com> wrote:
>>
>>> Hi,
>>> I've set spark.broadcast.factory to
>>> org.apache.spark.broadcast.HttpBroadcastFactory and it indeed resolved my
>>> issue.
>>>
>>> I'm creating a dataframe which creates a broadcast variable internally
>>> and then fails due to the torrent broadcast with the following stacktrace:
>>> Caused by: org.apache.spark.SparkException: Failed to get broadcast_3_piece0 of broadcast_3
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>>         at scala.Option.getOrElse(Option.scala:120)
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>>         at scala.collection.immutable.List.foreach(List.scala:318)
>>>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
>>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1220)
>>>
>>> I'm using spark 1.6.0 on CDH 5.7
>>>
>>> Thanks,
>>> Daniel
>>>
>>>
>>> On Wed, Jun 1, 2016 at 5:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>>> I found spark.broadcast.blockSize but no parameter to switch the
>>>> broadcast method.
>>>>
>>>> Can you describe the issues with torrent broadcast in more detail?
>>>>
>>>> Which version of Spark are you using?
>>>>
>>>> Thanks
>>>>
>>>> On Wed, Jun 1, 2016 at 7:48 AM, Daniel Haviv <
>>>> daniel.haviv@veracity-group.com> wrote:
>>>>
>>>>> Hi,
>>>>> Our application is failing due to issues with the torrent broadcast.
>>>>> Is there a way to switch to another broadcast method?
>>>>>
>>>>> Thank you.
>>>>> Daniel
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>


-- 
---
Takeshi Yamamuro
