spark-dev mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: TorrentBroadcast slow performance
Date Tue, 07 Oct 2014 18:42:50 GMT
Could you create a JIRA for it? It may be a regression introduced after
https://issues.apache.org/jira/browse/SPARK-3119.

We would appreciate it if you could describe how to reproduce it.

On Mon, Oct 6, 2014 at 1:27 AM, Guillaume Pitel
<guillaume.pitel@exensa.com> wrote:
> Hi,
>
> I've had no answer to this on user@spark.apache.org, so I'm posting it on dev
> before filing a JIRA (in case the problem or solution is already identified).
>
> We've had some performance issues since switching to 1.1.0, and we finally
> found the cause: TorrentBroadcast seems to be very slow in our setup
> (and it became the default in 1.1.0).
>
> Logs for a 4 MB broadcast variable with TorrentBroadcast (15 s):
>
> 14/10/01 15:47:13 INFO storage.MemoryStore: Block broadcast_84_piece1 stored
> as bytes in memory (estimated size 171.6 KB, free 7.2 GB)
> 14/10/01 15:47:13 INFO storage.BlockManagerMaster: Updated info of block
> broadcast_84_piece1
> 14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4194304) called
> with curMem=1401611984, maxMem=9168696115
> 14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84_piece0 stored
> as bytes in memory (estimated size 4.0 MB, free 7.2 GB)
> 14/10/01 15:47:23 INFO storage.BlockManagerMaster: Updated info of block
> broadcast_84_piece0
> 14/10/01 15:47:23 INFO broadcast.TorrentBroadcast: Reading broadcast
> variable 84 took 15.202260006 s
> 14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4371392) called
> with curMem=1405806288, maxMem=9168696115
> 14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84 stored as
> values in memory (estimated size 4.2 MB, free 7.2 GB)
>
> (Notice the 10 s lag between the "Updated info of block broadcast_..."
> line and the next MemoryStore line.)
>
> And with HttpBroadcast (0.3s):
>
> 14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Started reading broadcast
> variable 147
> 14/10/01 16:05:58 INFO storage.MemoryStore: ensureFreeSpace(4369376) called
> with curMem=1373493232, maxMem=9168696115
> 14/10/01 16:05:58 INFO storage.MemoryStore: Block broadcast_147 stored as
> values in memory (estimated size 4.2 MB, free 7.3 GB)
> 14/10/01 16:05:58 INFO broadcast.HttpBroadcast: Reading broadcast variable
> 147 took 0.320907112 s
> 14/10/01 16:05:58 INFO storage.BlockManager: Found block broadcast_147
> locally
>
> Since Torrent is supposed to perform much better than Http, we suspect a
> configuration error on our side, but are unable to pin it down. Does
> anyone have an idea what is causing the problem?
>
> For now we're sticking with the HttpBroadcast workaround.
>
> Guillaume
> --
> Guillaume PITEL, Président
> +33(0)626 222 431
>
> eXenSa S.A.S.
> 41, rue Périer - 92120 Montrouge - FRANCE
> Tel +33(0)184 163 677 / Fax +33(0)972 283 705
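
The 10 s pause pointed out in the quoted TorrentBroadcast log can be located mechanically by comparing the timestamps of consecutive log entries. The following is a minimal sketch, assuming the mail quote prefix has been stripped and wrapped lines rejoined; the function name `find_gaps` and the 5-second threshold are illustrative choices, not anything from Spark itself. The sample lines are taken from the log excerpt above.

```python
import re
from datetime import datetime

# Assumption: entries start with a "yy/MM/dd HH:mm:ss" timestamp, as in the
# log excerpt quoted in this thread.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def find_gaps(lines, min_gap_seconds=5.0):
    """Return (gap_seconds, previous_line, next_line) for every pause of at
    least min_gap_seconds between consecutive timestamped lines."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # no timestamp: skip (e.g. a wrapped continuation line)
        current = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        if prev_time is not None:
            delta = (current - prev_time).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev_line, line))
        prev_time, prev_line = current, line
    return gaps


# Entries from the TorrentBroadcast excerpt, with wrapped lines rejoined.
sample = [
    "14/10/01 15:47:13 INFO storage.MemoryStore: Block broadcast_84_piece1 "
    "stored as bytes in memory (estimated size 171.6 KB, free 7.2 GB)",
    "14/10/01 15:47:13 INFO storage.BlockManagerMaster: Updated info of "
    "block broadcast_84_piece1",
    "14/10/01 15:47:23 INFO storage.MemoryStore: ensureFreeSpace(4194304) "
    "called with curMem=1401611984, maxMem=9168696115",
    "14/10/01 15:47:23 INFO storage.MemoryStore: Block broadcast_84_piece0 "
    "stored as bytes in memory (estimated size 4.0 MB, free 7.2 GB)",
]

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:.0f}s gap after: {before}")
```

Running the same function over a full executor log would list every such pause, which may help narrow down a reproduction case for the JIRA.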
