spark-user mailing list archives

From Michael Albert <m_albert...@yahoo.com.INVALID>
Subject BUG: when running as "extends App", closures don't capture variables
Date Wed, 29 Oct 2014 22:16:36 GMT
Greetings!
This might be a documentation issue rather than a coding issue, in that perhaps the correct
answer is "don't do that", but since that is not obvious, I am writing.
The following code produces output most would not expect:
package misc

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object DemoBug extends App {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)

    val rdd = sc.parallelize(List("A","B","C","D"))
    val str1 = "A"

    val rslt1 = rdd.filter(x => { x != "A" }).count
    val rslt2 = rdd.filter(x => { str1 != null && x != "A" }).count

    println("DemoBug: rslt1 = " + rslt1 + " rslt2 = " + rslt2)
}
This produces the output:

DemoBug: rslt1 = 3 rslt2 = 0
Compiled with sbt:

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"

Run on an EC2 EMR instance with a recent image (Hadoop 2.4.0, Spark 1.1.0).
If instead there is a proper "main()", it works as expected.
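For reference, a minimal sketch of the main() version I mean (DemoBugMain is just an
illustrative name; the body mirrors the code above); this prints rslt1 = 3 rslt2 = 3:

package misc

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object DemoBugMain {
    def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
        val sc = new SparkContext(conf)

        val rdd = sc.parallelize(List("A","B","C","D"))
        val str1 = "A"

        // Inside main(), str1 is an ordinary local that the closure captures,
        // so both filters see the same value and each count comes out as 3.
        val rslt1 = rdd.filter(x => { x != "A" }).count
        val rslt2 = rdd.filter(x => { str1 != null && x != "A" }).count

        println("DemoBugMain: rslt1 = " + rslt1 + " rslt2 = " + rslt2)
    }
}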
Thank you.
Sincerely, Mike