spark-user mailing list archives

From Filip Andrei <>
Subject Odd error when using a rdd map within a stream map
Date Thu, 18 Sep 2014 13:57:21 GMT
Here I wrote a simpler version of the code to get an understanding of how it works:

final List<NeuralNet> nns = new ArrayList<NeuralNet>();
for (int i = 0; i < numberOfNets; i++) {
    nns.add(new NeuralNet()); // (loop body elided in the original)
}

final JavaRDD<NeuralNet> nnRdd = sc.parallelize(nns);
JavaDStream<Float> results = rndLists.flatMap(new FlatMapFunction<Map<String, Object>, Float>() {
    public Iterable<Float> call(Map<String, Object> input) throws Exception {
        Float f = nnRdd.map(new Function<NeuralNet, Float>() {
            public Float call(NeuralNet nn) throws Exception {
                return 1.0f;
            }
        }).reduce(new Function2<Float, Float, Float>() {
            public Float call(Float left, Float right) throws Exception {
                return left + right;
            }
        });
        return Arrays.asList(f);
    }
});

This works as expected, and print() simply shows the number of neural nets I created.
If instead of print() I use

results.foreach(new Function<JavaRDD<Float>, Void>() {
    public Void call(JavaRDD<Float> arg0) throws Exception {
        for (Float f : arg0.collect()) {
            // (loop body elided in the original)
        }
        return null;
    }
});

It fails with the following exception
org.apache.spark.SparkException: Job aborted due to stage failure: Task
1.0:0 failed 1 times, most recent failure: Exception failure in TID 1 on
host localhost: java.lang.NullPointerException

This is weird to me, since the same code executes as expected in one case and
doesn't in the other. Any idea what's going on here?
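For reference, the symptom can be reproduced without Spark at all. The sketch below (the FakeRdd class is hypothetical, not Spark code) shows how a transient field that is silently nulled out by a serialize/deserialize round trip, analogous to whatever driver-side state travels inside the nnRdd closure, can make a call that works locally throw a NullPointerException on the other side:

```java
import java.io.*;

// Plain-Java sketch: a Serializable object with a transient field, mimicking
// state that exists on the driver but does not survive serialization to a worker.
class FakeRdd implements Serializable {
    // Field initializers do NOT re-run on deserialization, so this is null
    // after a round trip through ObjectOutputStream/ObjectInputStream.
    transient String context = "driver";

    float reduceSum() {
        // Dereferencing the transient field: fine locally, NPE after a round trip.
        return context.isEmpty() ? 0f : 1.0f;
    }
}

public class NestedRddSketch {
    // Serialize and immediately deserialize, as Spark does with task closures.
    static FakeRdd roundTrip(FakeRdd rdd) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(rdd);
        ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (FakeRdd) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        FakeRdd driverSide = new FakeRdd();
        System.out.println("driver side: " + driverSide.reduceSum()); // works

        FakeRdd workerSide = roundTrip(driverSide);
        try {
            workerSide.reduceSum();
            System.out.println("worker side: no exception");
        } catch (NullPointerException e) {
            System.out.println("worker side: NullPointerException"); // same symptom
        }
    }
}
```

This is only an analogy for the failure mode, not an explanation of which field inside Spark is null in my job.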

