Hi,

I have a list of Student objects with one lakh (100,000) entries, and I am trying to save it as a text file in HDFS. The save throws a NullPointerException.

 

JavaRDD<Student> distData = sc.parallelize(list);
distData.saveAsTextFile("hdfs://master/data/spark/instruments.txt");

 

 

14/11/18 01:33:21 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5, master): java.lang.NullPointerException:
        org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158)
        org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)
        org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        java.lang.Thread.run(Thread.java:745)
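
One thing I can check on my side, as a guess only: the trace fails at RDD.scala:1158 inside saveAsTextFile, which I assume is where each element is turned into a string, so a null Student in the list might be the cause. Below is a minimal plain-Java sketch for finding such entries before calling sc.parallelize. The names NullEntryCheck, nullIndexes and withoutNulls are hypothetical helpers of my own, not part of the Spark API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Spark API): inspects a list for null entries
// before it is handed to sc.parallelize(...).
class NullEntryCheck {

    // Returns the positions of all null elements in the given list.
    static <T> List<Integer> nullIndexes(List<T> items) {
        List<Integer> indexes = new ArrayList<>();
        int position = 0;
        for (T item : items) {
            if (item == null) {
                indexes.add(position);
            }
            position++;
        }
        return indexes;
    }

    // Returns a new list containing only the non-null elements, in order.
    static <T> List<T> withoutNulls(List<T> items) {
        List<T> cleaned = new ArrayList<>();
        for (T item : items) {
            if (item != null) {
                cleaned.add(item);
            }
        }
        return cleaned;
    }
}
```

If nullIndexes(list) comes back non-empty, passing withoutNulls(list) to sc.parallelize would be one way to test whether the exception goes away.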

 

 

How can I handle this?

 

-Naveen