spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: How to merge a RDD of RDDs into one uber RDD
Date Wed, 07 Jan 2015 10:49:20 GMT
I think you mean union(). Yes, you could also simply make an RDD for each
file, and use SparkContext.union() to put them together.
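
The suggestion above could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the object name, the app name, and the stand-in data are assumptions, and in a real job each per-file RDD would more likely come from something like sc.textFile(path). The key point is that the per-file RDDs are built in an ordinary driver-side collection (a Seq), not inside another RDD, since RDD operations cannot be invoked from within another RDD's transformations.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical driver program; names and data are illustrative only.
object UnionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("union-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-ins for "one RDD per file". With real input this would be
      // something like paths.map(p => sc.textFile(p)), evaluated on the driver.
      val perFile: Seq[RDD[String]] = Seq(
        sc.parallelize(Seq("file1-line1", "file1-line2")),
        sc.parallelize(Seq("file2-line1"))
      )

      // SparkContext.union takes a Seq of RDDs and returns a single RDD
      // containing the elements of all of them (duplicates are kept).
      val all: RDD[String] = sc.union(perFile)

      all.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because the Seq lives on the driver, no RDD is ever nested inside another one, which is the situation the original "RDD of RDDs" approach ran into.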

On Wed, Jan 7, 2015 at 9:51 AM, Raghavendra Pandey <
raghavendra.pandey@gmail.com> wrote:

> You can also use the join function of RDD. It is effectively an append
> function that adds up all the RDDs and creates one uber RDD.
>
> On Wed, Jan 7, 2015, 14:30 rkgurram <rkgurram@gmail.com> wrote:
>
>> Thank you for the response; I will certainly try that out.
>>
>> For now I have changed my code so that the first map, "files.map", is
>> "files.flatMap", which I guess does something similar to what you are
>> suggesting. It gives me a List[] of elements (in this case LabeledPoints;
>> I could also do RDDs), which I then turn into one mega RDD. The original
>> problem seems to be gone: I no longer get the NPE. Further down, however,
>> I am now getting an indexOutOfBounds, so I am trying to figure out whether
>> the original problem is manifesting itself as a new one.
>>
>>
>> Regards
>> -Ravi
