Thank you all for the feedback.  As Josh suggested, the issue was due to extending App.


On Wed, Jan 29, 2014 at 5:57 PM, Josh Rosen <rosenville@gmail.com> wrote:
Try removing the "extends App" and write a "main(args: Array[String])" method instead.  I think that App affects the serialization (there might be some threads about this on the old mailing list).
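A minimal sketch of the suggested restructuring (the names and the string date are illustrative stand-ins; the round-trip below mimics how Spark serializes task closures before shipping them to executors, using plain Java serialization):

```scala
import java.io._

object Runner {
  // Serialize and deserialize a value, roughly the way Spark ships
  // task closures to executors.
  def roundTrip[A <: AnyRef](a: A): A = {
    val buf = new ByteArrayOutputStream()
    new ObjectOutputStream(buf).writeObject(a)
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject().asInstanceOf[A]
  }

  def main(args: Array[String]): Unit = {
    // With a plain main method, `end` is fully initialized before the
    // closure below captures it (a string stands in for DateTime here).
    val end = "2014-01-30"
    val predicate: String => Boolean = date => end < date

    val restored = roundTrip(predicate)
    println(restored("2014-02-01")) // the captured `end` survived the trip
  }
}
```

The key point is that `end` is created inside `main`, so it exists before any closure referencing it is constructed and serialized.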


On Wed, Jan 29, 2014 at 2:54 PM, ɭ <yinxusen@gmail.com> wrote:

Could you give some more details? e.g. the context of your code and the exception stack trace.

Your code seems weird. Have you already created a SparkContext? The REPL adds some necessary components that a standalone application does not.

2014-1-30 4:35 AM, "Michael Diamant" <diamant.michael@gmail.com> wrote:

My team recently began writing Spark jobs to be deployed to a Spark cluster in the form of a jar.  Previously, my team interacted with Spark via the REPL.  The job in question works within the REPL, but fails when executed non-interactively (i.e. packaged as a jar).

The job code looks similar to:
// imports
object Runner extends App {
  val end = new DateTime()
  // additional setup
  someRdd.filter(f => end.isAfter(f.date))
}

The point of this example is that a value, end, is defined local to the driver.  Later in the program's execution, end is referenced in the filter predicate of an RDD.  When running non-interactively, an NPE (NullPointerException) occurs when 'end' is referenced in the filter predicate.  However, the exact same code executes successfully in the REPL.
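The NPE can be reproduced outside Spark. In Scala 2.x, App mixes in DelayedInit, so the object body (including val initializers) only runs when main is invoked; code that serializes a closure referencing such a field can observe the field's default value, null. A minimal sketch (ViaApp and ViaMain are illustrative names, with a string standing in for DateTime):

```scala
// In Scala 2.x, App mixes in DelayedInit: `end` below is initialized
// only once main runs. Until then the field holds null, which is what
// a serialized filter predicate would observe on the executors.
object ViaApp extends App {
  val end: String = "2014-01-30"
}

// With an explicit main-style entry point, `end` is a local value that
// is fully initialized by the time any closure captures it.
object ViaMain {
  def run(): Boolean = {
    val end = "2014-01-30"
    val predicate: String => Boolean = date => end < date
    predicate("2014-02-01")
  }
}
```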

Spark environment details are:
Spark version:  v0.9 using commit SHA e2ebc3a9d8bca83bf842b134f2f056c1af0ad2be
Scala version: v2.9.3

I appreciate any help in identifying bugs/mistakes made.

Thank you,
Michael