My Spark app maps lines from a text file to case class instances stored in an RDD.


When I run the following code on this RDD:

.collect.map(line => if(validate_hostname(line, data_frame)) line).foreach(println)


It correctly calls the method validate_hostname, passing both the case class and a data_frame defined within the main method. Unfortunately, the map above only returns a TraversableLike collection, so I can’t do transformations and joins on the result. I therefore tried to apply a filter on the RDD with the following code:

.filter(line => validate_hostname(line, data_frame)).count()


Unfortunately, filtering the RDD this way does not pass the data_frame, so I get a NullPointerException, although the case class is passed correctly (I print it within the method).


Where am I going wrong?
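
One likely cause, stated with some caution since the body of validate_hostname is not shown: a DataFrame can only be used on the driver. After collect, the map runs on the driver, so data_frame is usable there. The closure passed to filter runs inside Spark tasks, where a DataFrame captured from the driver is not usable, and touching it typically surfaces as a NullPointerException. Below is a minimal sketch of one common workaround: read the lookup values out of the DataFrame on the driver, broadcast them, and filter against the broadcast value. The names HostRecord, FilterSketch, validateHostname, knownHosts, validRecords, and the "hostname" column are all hypothetical stand-ins, not the original code, and the sketch assumes the lookup column is small enough to collect.

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical stand-in for the case class built from each text line.
case class HostRecord(hostname: String, payload: String)

object FilterSketch {
  // Hypothetical validation: a plain Set lookup, so it is safe to call
  // inside an RDD closure (no DataFrame involved).
  def validateHostname(record: HostRecord, knownHosts: Set[String]): Boolean =
    knownHosts.contains(record.hostname)

  // Runs on the driver: pull the lookup column out of the DataFrame once.
  // Assumes a non-null string column named "hostname" (an assumption).
  def knownHosts(dataFrame: DataFrame): Set[String] =
    dataFrame.select("hostname").collect().map(_.getString(0)).toSet

  def validRecords(spark: SparkSession,
                   records: RDD[HostRecord],
                   dataFrame: DataFrame): RDD[HostRecord] = {
    val hosts: Broadcast[Set[String]] =
      spark.sparkContext.broadcast(knownHosts(dataFrame))
    // The closure captures only the broadcast handle, never the DataFrame.
    records.filter(record => validateHostname(record, hosts.value))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data, for illustration only.
    val dataFrame = Seq("a.example.com", "b.example.com").toDF("hostname")
    val records = spark.sparkContext.parallelize(Seq(
      HostRecord("a.example.com", "x"),
      HostRecord("c.example.com", "y")
    ))

    // The result is still an RDD, so further transformations and joins work.
    println(validRecords(spark, records, dataFrame).count())
    spark.stop()
  }
}
```

If the lookup data is too large to collect, an alternative is to convert the RDD to a DataFrame and express the validation as a join, so that both sides stay distributed.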





