spark-user mailing list archives

From: ayan guha <guha.a...@gmail.com>
Subject: Re: How to add a column to a spark RDD with many columns?
Date: Fri, 01 May 2015 05:57:58 GMT

Do you have an RDD or a DataFrame? The rows of an RDD are essentially
tuples, so you can add a new column with a map over the rows.
RDDs are immutable, so the map gives you a new RDD rather than changing
the existing one.
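
A minimal PySpark sketch of the map approach, for illustration only. It
assumes each row has already been parsed into a tuple of numbers (if the
RDD holds raw text lines, they would need to be split and converted
first). The helper names with_sum_column and add_sum_column are
hypothetical, and the 2nd and 200th columns are taken as zero-based
indices 1 and 199.

```python
def with_sum_column(row, i=1, j=199):
    """Return a new tuple: the original fields plus row[i] + row[j].

    Defaults i=1 and j=199 are the zero-based positions of the
    2nd and 200th columns.
    """
    return tuple(row) + (row[i] + row[j],)


def add_sum_column(rdd, i=1, j=199):
    """Map every row of `rdd` to a row with the extra sum column.

    `rdd` is assumed to be a PySpark RDD of numeric tuples. map() is
    a transformation that returns a new RDD; the input is unchanged.
    """
    return rdd.map(lambda row: with_sum_column(row, i, j))
```

Usage would look like new_rdd = add_sum_column(rdd). The map is a single
pass over the data, and each row costs only one tuple concatenation, so
the number of columns should not matter much.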
On 1 May 2015 14:59, "Carter" <gyzhen@hotmail.com> wrote:

> Hi all,
>
> I have an RDD with *many* columns (e.g., *hundreds*). How do I add one more
> column at the end of this RDD?
>
> For example, if my RDD is like below:
>
>     123, 523, 534, ..., 893
>     536, 98, 1623, ..., 98472
>     537, 89, 83640, ..., 9265
>     7297, 98364, 9, ..., 735
>     ......
>     29, 94, 956, ..., 758
>
> how can I efficiently add a column to it, whose value is the sum of the 2nd
> and the 200th columns?
>
> Thank you very much.
