spark-user mailing list archives

From Mosharaf Chowdhury <mosharafka...@gmail.com>
Subject Re: TorrentBroadcast + persist = bug
Date Wed, 22 Jan 2014 08:58:49 GMT
Milos, thanks for reporting this. It seems that merely initializing
TorrentBroadcast is causing the trouble (I don't see you calling broadcast
at all).

I'll try to make some time to reproduce it tomorrow and let you know.



--
Mosharaf Chowdhury
http://www.mosharaf.com/


On Wed, Jan 22, 2014 at 12:22 AM, Milos Nikolic
<milos.nikolic83@gmail.com> wrote:

> Can anyone confirm this?
>
> On Jan 20, 2014, at 12:22 PM, Milos Nikolic <milos.nikolic83@gmail.com>
> wrote:
>
> > Hello,
> >
> > I think there is a bug with TorrentBroadcast in the latest release
> > (0.8.1). The problem is that even a simple job (e.g., rdd.count) hangs
> > waiting for some tasks to finish. Here is how to reproduce the problem:
> >
> > 1) Configure Spark so that node X is both the master and one of the
> > workers (e.g., 5 nodes => 5 workers and 1 master)
> > 2) Activate TorrentBroadcast
> > 3) Use the Kryo serializer (the problem occurs more often with it than
> > with the Java serializer)
> > 4) Read a file from HDFS, persist the RDD, and call count
> >
> > In almost 80% of runs (~50% with the Java serializer), the count job
> > hangs waiting for two tasks from node X to finish. The problem *does not*
> > appear if: 1) I separate the master from the worker nodes, or 2) I use
> > HttpBroadcast, or 3) I do not persist the RDD.
> >
> > The code is below.
> >
> >  def main(args: Array[String]): Unit = {
> >
> >    System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
> >    System.setProperty("spark.kryo.registrator", "test.MyRegistrator")
> >    System.setProperty("spark.broadcast.factory", "org.apache.spark.broadcast.TorrentBroadcastFactory")
> >
> >    val sc = new SparkContext(...)
> >
> >    val file = "hdfs://server:9000/user/xxx/Test.out"  // ~750MB
> >    val rdd = sc.textFile(file)
> >    rdd.persist
> >    println("Counting: " + rdd.count)
> >  }
> >
> >
> > Best regards,
> > Milos
>
>
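
For reference, a minimal sketch of the HttpBroadcast workaround mentioned in
the report above. It assumes the Spark 0.8.1 system-property style used in the
quoted code and that the properties are set before the SparkContext is
created; only the broadcast factory line differs from the original.

```scala
// Sketch only: same settings as the quoted report, except for the factory.
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "test.MyRegistrator")
// Swap TorrentBroadcastFactory for HttpBroadcastFactory, which the report
// says does not trigger the hang.
System.setProperty("spark.broadcast.factory", "org.apache.spark.broadcast.HttpBroadcastFactory")
```

This only sidesteps the hang; it does not address whatever TorrentBroadcast
does during initialization.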
