spark-user mailing list archives

From Raghavendra Pandey <raghavendra.pan...@gmail.com>
Subject Re: How to merge a RDD of RDDs into one uber RDD
Date Wed, 07 Jan 2015 15:16:08 GMT
Yup, I meant union only.
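
For anyone finding this thread later, here is a minimal sketch of the union approach Sean describes below. It assumes a local SparkContext, and it uses small in-memory collections as stand-ins for the per-file RDDs (with real files you would presumably call sc.textFile on each path instead). The names perFile and merged are made up for illustration. The key point is that the RDDs are held in an ordinary Seq on the driver, not inside another RDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Assumption: local mode, just for trying this out.
val conf = new SparkConf().setAppName("union-sketch").setMaster("local[2]")
val sc = new SparkContext(conf)

// Stand-ins for "one RDD per file". These live in a driver-side Seq,
// so there is no RDD of RDDs anywhere.
val perFile: Seq[RDD[Int]] = Seq(
  sc.parallelize(Seq(1, 2, 3)),
  sc.parallelize(Seq(4, 5))
)

// SparkContext.union concatenates all of them into a single RDD.
// It does not deduplicate and does not shuffle.
val merged: RDD[Int] = sc.union(perFile)

println(merged.count())

// Call sc.stop() once you are done with the context.
```

For just two RDDs, rdd1.union(rdd2) (or rdd1 ++ rdd2) does the same thing. sc.union is more convenient when the number of inputs is not known up front.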

On Wed, Jan 7, 2015, 16:19 Sean Owen <sowen@cloudera.com> wrote:

> I think you mean union(). Yes, you could also simply make an RDD for each
> file, and use SparkContext.union() to put them together.
>
> On Wed, Jan 7, 2015 at 9:51 AM, Raghavendra Pandey <raghavendra.pandey@gmail.com> wrote:
>
>> You can also use the join function of RDD. This is actually a kind of
>> append function that adds up all the RDDs and creates one uber RDD.
>>
>> On Wed, Jan 7, 2015, 14:30 rkgurram <rkgurram@gmail.com> wrote:
>>
>>> Thank you for the response; I will certainly try that out.
>>>
>>> For now I have changed the first map in my code from "files.map" to
>>> "files.flatMap", which I guess does something similar to what you are
>>> suggesting. It gives me a List[] of elements (in this case LabeledPoints;
>>> I could also do RDDs), which I then turn into a mega RDD. The current
>>> problem seems to be gone: I no longer get the NPE. Further down, however,
>>> I am getting an indexOutOfBounds, so I am trying to figure out whether
>>> the original problem is manifesting itself as a new one.
>>>
>>>
>>> Regards
>>> -Ravi
