spark-user mailing list archives

From Shixiong Zhu <zsxw...@gmail.com>
Subject Re: Bug in Accumulators...
Date Fri, 07 Nov 2014 08:03:25 GMT
Could you provide the complete code that reproduces the bug? Here is
my test code:

import org.apache.spark._
import org.apache.spark.SparkContext._

object SimpleApp {

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)

    // One accumulator, created once on the driver, outside the loop
    val accum = sc.accumulator(0)
    for (i <- 1 to 10) {
      // Each iteration builds a new RDD and updates the same accumulator
      sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
    }
    sc.stop()
  }
}

It works fine in both client and cluster mode. Since this looks like a
serialization bug, the outer class matters. Could you provide it? Is there
a SparkContext field in the outer class?
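
To illustrate why the outer class matters, here is a minimal sketch that
needs no Spark at all. It assumes plain Java serialization, which is what
Spark uses by default to ship closures to executors. FakeContext, Job, and
ClosureCheck are hypothetical names made up for this example; FakeContext
only stands in for a non-serializable field such as a SparkContext.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable resource (e.g. a SparkContext).
class FakeContext

// Hypothetical outer class: not Serializable, and it holds the context as a field.
class Job(val ctx: FakeContext) {
  var total = 0

  // `total` is a field of Job, so this closure captures `this` (the whole Job).
  def capturing: Int => Unit = x => total += x
}

object ClosureCheck {
  // References nothing from an enclosing instance, so nothing extra is captured.
  val standalone: Int => Int = x => x + 1

  // Returns true if Java serialization can write the object, false if it
  // hits a non-serializable object somewhere in the object graph.
  def serializes(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println("standalone closure serializes: " + serializes(standalone))
    println("capturing closure serializes: " + serializes(new Job(new FakeContext).capturing))
  }
}
```

A closure that touches a field or method of its enclosing class drags the
whole instance along, so every field of that instance has to be
serializable too. Copying the needed value into a local val before building
the closure is a common way to avoid that.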

Best Regards,
Shixiong Zhu

2014-10-28 0:28 GMT+08:00 octavian.ganea <octavian.ganea@inf.ethz.ch>:

> I am also using Spark 1.1.0 and I ran it on a cluster of nodes (it works
> if I run it in local mode!)
>
> If I put the accumulator inside the for loop, everything will work fine. I
> guess the bug is that an accumulator can be applied to JUST one RDD.
>
> Still another undocumented 'feature' of Spark that no one from the people
> who maintain Spark is willing to solve or at least to tell us about ...
