I think you mean union(). Yes, you could also simply make an RDD for each file, and use SparkContext.union() to put them together.
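
For illustration, here is a minimal sketch of that approach. It assumes the file paths are available as a plain Seq on the driver, not as an RDD, because RDDs cannot be created inside another RDD's transformation. The names UnionExample, loadAll and paths are hypothetical and are not taken from the original code.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object UnionExample {
  // Hypothetical helper: builds one RDD per input path on the driver,
  // then merges them into a single RDD with SparkContext.union().
  def loadAll(sc: SparkContext, paths: Seq[String]): RDD[String] = {
    // paths is a local collection, so this map runs on the driver
    // and yields a Seq of independent RDDs, not an RDD of RDDs.
    val perFile: Seq[RDD[String]] = paths.map(p => sc.textFile(p))
    // union() concatenates the partitions of all inputs; it does not
    // deduplicate or shuffle the data.
    sc.union(perFile)
  }
}
```

A call would look something like UnionExample.loadAll(sc, Seq("a.txt", "b.txt")), after which any per-line parsing can be a single map over the merged RDD. Note that join() is a different operation: it matches up elements of two key-value RDDs by key rather than appending one RDD to another.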

On Wed, Jan 7, 2015 at 9:51 AM, Raghavendra Pandey <raghavendra.pandey@gmail.com> wrote:

You can also use the join function of an RDD. It is effectively an append function that adds up all the RDDs and creates one uber RDD.

On Wed, Jan 7, 2015, 14:30 rkgurram <rkgurram@gmail.com> wrote:
Thank you for the response. Sure, I will try that out.

I have now changed my code so that the first map, "files.map", is
"files.flatMap", which I guess does something similar to what you are suggesting. It gives
me a List[] of elements (in this case LabeledPoints; I could also do RDDs),
which I then turn into one mega RDD. The original problem seems to be gone: I
no longer get the NPE, but further down I am now getting an indexOutOfBounds, so I am
trying to figure out whether the original problem is manifesting itself as a new


View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-merge-a-RDD-of-RDDs-into-one-uber-RDD-tp20986p21012.html