spark-user mailing list archives

From lihu <lihu...@gmail.com>
Subject Re: Connection Error Problem
Date Mon, 02 Dec 2013 15:05:55 GMT
I also ran into this problem, on Spark 0.8.
Here is the SparkLR code; I only changed N and D between the two
experiments.


  // SparkLR excerpt (sc is the SparkContext; Vector is org.apache.spark.util.Vector)
  import java.util.Random
  import scala.math.exp
  import org.apache.spark.util.Vector

  val N = 10000        // Number of data points
  val D = 100          // Number of dimensions
  val R = 0.7          // Scaling factor
  val ITERATIONS = 10
  val rand = new Random(42)

  case class DataPoint(x: Vector, y: Double)

  def generateData = {
    def generatePoint(i: Int) = {
      val y = if (i % 2 == 0) -1 else 1
      val x = Vector(D, _ => rand.nextGaussian + y * R)
      DataPoint(x, y)
    }
    Array.tabulate(N)(generatePoint)
  }

  val numSlices = 4
  val points = sc.parallelize(generateData, numSlices).cache()

  var w = Vector(D, _ => 2 * rand.nextDouble - 1)

  for (i <- 1 to ITERATIONS) {
    println("On iteration " + i)
    val gradient = points.map { p =>
      (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
    }.reduce(_ + _)
    w -= gradient
  }
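A side note on the map/reduce step above: it computes the standard batch logistic-regression gradient, w -= sum_i (1/(1+exp(-y_i * w.x_i)) - 1) * y_i * x_i. A minimal standalone sketch of the same update on plain arrays, without Spark (the object and method names here are illustrative, not from SparkLR):

```scala
import scala.math.exp

object LogisticGradientSketch {
  // Dot product of two equal-length vectors
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // One batch gradient-descent step, mirroring the points.map{...}.reduce(_ + _)
  // in the SparkLR loop, but over a local collection instead of an RDD.
  def step(points: Seq[(Array[Double], Double)], w: Array[Double]): Array[Double] = {
    val grad = Array.fill(w.length)(0.0)
    for ((x, y) <- points) {
      val coeff = (1.0 / (1.0 + exp(-y * dot(w, x))) - 1.0) * y
      for (j <- w.indices) grad(j) += coeff * x(j)
    }
    Array.tabulate(w.length)(j => w(j) - grad(j))
  }
}
```

For example, with w = (0, 0) and a single point x = (1, 0), y = 1, the coefficient is 0.5 - 1 = -0.5, so one step moves w to (0.5, 0).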



On Mon, Dec 2, 2013 at 11:00 PM, lihu <lihu723@gmail.com> wrote:

> Hi,
>
>        I ran the SparkLR example on Spark 0.9. With a small data set of
> about 8 MB it succeeds, but with about 800 MB of data I get the
> connection problem below.
>
>       At first I thought this was due to the Hadoop files, because I had
> not set them up; but it passes with the small data set, so that is not it.
> I searched the net and found a similar problem<http://mail-archives.apache.org/mod_mbox/spark-user/201310.mbox/%3CCANN3bXb2-BbcAa=ngCCd3Vvtcsx53MveQ4K8ZRXQG1BpRzjdww@mail.gmail.com%3E>,
> but no one answered it, so I think this may be a meaningful question.
> Can anyone give me some tips? Thanks in advance.
>
>       My environment is 10 workers and 1 master, and here is my configuration.
>
>      SPARK_WORKER_MEMORY=20g
>      SPARK_JAVA_OPTS+="-Dspark.executor.memory=8g
> -Dspark.local.dir=/disk3/djvv/tmp -Dspark.akka.timeout=60
> -Dspark.worker.timeout=80 -Dspark.akka.frameSize=10000 -Xms30G -Xmx30G
> -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"
>
>
> org.apache.spark.SparkException: Error notifying standalone scheduler's
> driver actor
>         at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:223)
>         at
> org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.executorRemoved(SparkDeploySchedulerBackend.scala:96)
>         at
> org.apache.spark.deploy.client.Client$ClientActor$$anonfun$receive$1.apply(Client.scala:133)
>         at
> org.apache.spark.deploy.client.Client$ClientActor$$anonfun$receive$1.apply(Client.scala:111)
>         at akka.actor.Actor$class.apply(Actor.scala:318)
>         at
> org.apache.spark.deploy.client.Client$ClientActor.apply(Client.scala:59)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:626)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:179)
>         at
> akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
>         at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
>         at
> akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
>         at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
>         at
> akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [10000] milliseconds
>         at akka.dispatch.DefaultPromise.ready(Future.scala:870)
>         at akka.dispatch.DefaultPromise.ready(Future.scala:847)
>         at akka.dispatch.Await$.ready(Future.scala:64)
>         at
> org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:220)
>
>    [sparkMaster-akka.actor.default-dispatcher-9] ERROR
> akka.actor.ActorSystemImpl - RemoteClientError@akka://spark@080.xxx.yyyy.com:17913:
> Error[java.net.ConnectException:Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>         at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:404)
>         at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:366)
>         at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:282)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> ]
>       [sparkMaster-akka.actor.default-dispatcher-9] ERROR
> akka.actor.ActorSystemImpl - RemoteClientError@akka://spark@080.xxx.yyyy.com:17913:
> Error[java.nio.channels.ClosedChannelException:null

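An observation on the configuration quoted above (hedged: the property names and defaults here are taken from Spark 0.8-era documentation and should be verified against the version in use). spark.akka.frameSize is measured in MB, so 10000 requests roughly 10 GB frames where the default is 10; and the "Futures timed out after [10000] milliseconds" message matches a 10-second driver-side ask timeout, which is a separate setting from spark.akka.timeout. A sketch of settings that might be worth trying:

```shell
# Hypothetical tuning sketch -- property names assumed from Spark 0.8-era docs,
# not confirmed by the thread above.
SPARK_JAVA_OPTS+=" -Dspark.akka.frameSize=100"   # in MB; default is 10, so 10000 asks for ~10 GB frames
SPARK_JAVA_OPTS+=" -Dspark.akka.askTimeout=60"   # seconds; the 10000 ms timeout suggests a 10 s default was hit
```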

-- 
Best Wishes!

Li Hu(李浒) | Graduate Student
Institute for Interdisciplinary Information Sciences (IIIS <http://iiis.tsinghua.edu.cn/>)
Tsinghua University, China

Email: lihu723@gmail.com
QQ   :790441432
Tel  : +86 15120081920
