spark-user mailing list archives

From Gerard Maas <>
Subject Re: Appending an incremental value to each RDD record
Date Tue, 16 Dec 2014 15:40:13 GMT
You would do:

rdd.zipWithIndex gives you an RDD[(Original, Long)], where the second
element is the zero-based index.
To get an (index, original) tuple, you will need to map that RDD to
the desired shape by swapping the two elements of each pair.
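
A minimal sketch of that mapping, assuming a local SparkContext and a small sample RDD of strings. The object name ZipWithIndexSketch and the helper withIds are made up for illustration and are not part of Spark. Since zipWithIndex starts counting at 0, the helper adds 1 to get the 1, 2, 3 ... numbering asked for; if a zero-based index is fine, rdd.zipWithIndex().map(_.swap) is enough.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ZipWithIndexSketch {
  // Hypothetical helper: pairs each record with a 1-based Long ID, ID first.
  // zipWithIndex yields (record, index) with a zero-based Long index.
  def withIds[T](rdd: RDD[T]): RDD[(Long, T)] =
    rdd.zipWithIndex().map { case (record, idx) => (idx + 1, record) }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch only.
    val conf = new SparkConf().setAppName("zipWithIndex-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq("a", "b", "c"))
      // Zero-based variant: just swap the pair produced by zipWithIndex.
      val zeroBased: RDD[(Long, String)] = rdd.zipWithIndex().map(_.swap)
      zeroBased.collect().foreach(println)
      // One-based variant, matching the numbering in the question.
      withIds(rdd).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Note that zipWithIndex has to run a Spark job to count the records per partition when the RDD has more than one partition. If the IDs only need to be unique rather than consecutive, rdd.zipWithUniqueId avoids that extra job.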

-kr, Gerard.

On Tue, Dec 16, 2014 at 4:12 PM, bethesda <> wrote:
> I think this is sort of a newbie question, but I've checked the API closely
> and don't see an obvious answer:
> Given an RDD, how would I create a new RDD of Tuples where the first Tuple
> value is an incremented Int, e.g. 1, 2, 3, ..., and the second value of the
> Tuple is the original RDD record?  I'm trying to simply assign a unique ID to
> each record in my RDD.  (I want to stay in RDD land, and not convert to a List
> and back to RDD, since that seems unnecessary and probably bad form.)
> Thanks.
