spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: BUG: when running as "extends App", closures don't capture variables
Date Wed, 29 Oct 2014 22:47:26 GMT
Good catch! If you'd like, you can send a pull request changing the files in docs/ to do this
(see https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark),
otherwise maybe open an issue on https://issues.apache.org/jira/browse/SPARK
so we can track it.

Matei

> On Oct 29, 2014, at 3:16 PM, Michael Albert <m_albert137@yahoo.com.INVALID> wrote:
> 
> Greetings!
> 
> This might be a documentation issue rather than a coding issue, in that perhaps the
> correct answer is "don't do that", but since this is not obvious, I am writing.
> 
> The following code produces output that most would not expect:
> 
> package misc
> 
> import org.apache.spark.SparkConf
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkContext._
> 
> object DemoBug extends App {
>     val conf = new SparkConf()
>     val sc = new SparkContext(conf)
> 
>     val rdd = sc.parallelize(List("A","B","C","D"))
>     val str1 = "A"
> 
>     val rslt1 = rdd.filter(x => { x != "A" }).count
>     val rslt2 = rdd.filter(x => { str1 != null && x != "A" }).count
>     
>     println("DemoBug: rslt1 = " + rslt1 + " rslt2 = " + rslt2)
> }
> 
> This produces the output:
> DemoBug: rslt1 = 3 rslt2 = 0
> 
> Compiled with sbt:
> libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"
> Run on an EC2 EMR instance with a recent image (Hadoop 2.4.0, Spark 1.1.0).
> 
> If instead the object defines a proper "main()" method, it works as expected (see the
> sketch after this message).
> 
> Thank you.
> 
> Sincerely,
>  Mike
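
For reference, below is a minimal sketch of the main()-based variant described above (the
object name DemoFixed is illustrative; Spark 1.1.0 / Scala 2.10 assumed, as in the report).
The likely mechanism: with "extends App" the object's fields are assigned inside
delayedInit(), which runs only when main() is invoked on the driver, so when the executors
recreate the singleton while deserializing the closure, str1 is still null there. A local
val inside an explicit main() is captured by the closure itself and shipped with it.

package misc

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object DemoFixed {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)

    val rdd = sc.parallelize(List("A", "B", "C", "D"))
    // A local val: captured by the closures below, so it is serialized
    // with them and visible on the executors.
    val str1 = "A"

    val rslt1 = rdd.filter(x => x != "A").count
    val rslt2 = rdd.filter(x => str1 != null && x != "A").count

    // Both filters now drop only "A": rslt1 = 3, rslt2 = 3.
    println("DemoFixed: rslt1 = " + rslt1 + " rslt2 = " + rslt2)

    sc.stop()
  }
}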

