spark-user mailing list archives

From Michael Diamant <diamant.mich...@gmail.com>
Subject Non-interactive job fails to copy local variable to remote machines
Date Wed, 29 Jan 2014 20:35:07 GMT
My team recently began writing Spark jobs to be deployed to a Spark cluster
in the form of a jar.  Previously, my team interacted with Spark via the
REPL.  The job in question works within the REPL, but fails when executed
non-interactively (i.e. packaged as a jar).

The job code looks similar to:
// imports
object Runner extends App {
  // 'end' is defined on the driver and captured by the closure below
  val end = new DateTime()
  // additional setup
  someRdd.filter(f => end.isAfter(f.date))
}

The point of this example is that a value, end, is defined locally on the
driver.  Later in the program's execution, the locally defined value, end,
is referenced in the filter predicate of an RDD.  When running
non-interactively, an NPE occurs when 'end' is referenced in the filter
predicate.  However, running the exact same code via the REPL executes
successfully.
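If it helps in diagnosis, one thing I have been wondering about is Scala's
App trait: it extends DelayedInit, so the body of Runner (including the
assignment to 'end') runs inside main rather than in the object's
constructor.  The behavior can be reproduced without Spark at all (Holder
and Check below are illustrative names, not from our codebase):

```scala
// Fields of an object extending App are initialized by DelayedInit only
// when its main() runs.  If the object is merely loaded and accessed
// (as might happen on a remote executor), its vals are still null.
object Holder extends App {
  val greeting = "initialized" // deferred by DelayedInit, not set in the constructor
}

object Check {
  def main(args: Array[String]): Unit = {
    // Holder.main was never invoked, so Holder's body has not executed yet;
    // under Scala 2's App this prints null
    println(Holder.greeting)
  }
}
```

If that is indeed the cause, defining the value inside an explicit
def main (rather than in the body of an object extending App) might avoid
the null, but I have not confirmed this.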

Spark environment details are:
Spark version: v0.9 at commit e2ebc3a9d8bca83bf842b134f2f056c1af0ad2be
Scala version: v2.9.3

I appreciate any help in identifying bugs/mistakes made.

Thank you,
Michael
