spark-user mailing list archives

From Corey Nolet <cjno...@gmail.com>
Subject Re: Grouping elements in a RDD
Date Sun, 21 Jun 2015 01:40:19 GMT
If you use rdd.mapPartitions(), you'll be able to get hold of the
iterator for each partition. Then you should be able to call
iterator.grouped(size) on each of those partitions. Note that this means
the last group in each partition may have fewer than "size" elements. If
that's okay for you then this should work.
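A minimal sketch of that approach, assuming a local SparkContext (the object
name and the example RDD are illustrative, not from this thread):

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative app name -- not from the thread.
    object GroupedRddExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("grouped-rdd").setMaster("local[*]"))

        val rdd  = sc.parallelize(1 to 10, numSlices = 2)
        val size = 3

        // mapPartitions hands us each partition's iterator; Scala's
        // Iterator.grouped(size) chunks it lazily into Seqs of at most
        // `size` elements, so the last group per partition can be smaller.
        val grouped = rdd.mapPartitions(iter => iter.grouped(size))

        grouped.collect().foreach(println)
        sc.stop()
      }
    }

The grouping happens independently per partition, so there is no shuffle, but
groups never span partition boundaries.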

On Sat, Jun 20, 2015 at 7:48 PM, Brandon White <bwwinthehouse@gmail.com>
wrote:

> How would you do a .grouped(10) on an RDD, is it possible? Here is an
> example for a Scala list
>
> scala> List(1,2,3,4).grouped(2).toList
> res1: List[List[Int]] = List(List(1, 2), List(3, 4))
>
> I would like to group every n elements.
>
