spark-user mailing list archives

From Gylfi <>
Subject Re: Flatten list
Date Sat, 18 Jul 2015 07:21:47 GMT

To be honest I don't really understand your problem statement :(  but let's
just talk about how .flatMap() works. 
Unlike .map(), which only allows a one-to-one transformation, .flatMap()
allows 0, 1 or many outputs per item processed, but the output must take the
form of a sequence of the same type, like a /List/ for example. 
All the sequences will then be merged (i.e. flattened) in the end into a
single RDD of that type. 
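
To make the difference concrete, here is a minimal sketch that uses plain
Scala collections instead of an RDD. That is an assumption made so it runs
without a Spark cluster; RDD.flatMap() behaves analogously. The names
FlatMapSketch, words, lengths and chars are made up for this illustration. 

```scala
object FlatMapSketch {
  // Hypothetical sample data, not taken from the original thread.
  val words: List[String] = List("ab", "", "cde")

  // map: exactly one output per input element.
  val lengths: List[Int] = words.map(w => w.length)

  // flatMap: each input yields a sequence of 0, 1 or many outputs,
  // and all those sequences are merged into one flat collection.
  val chars: List[Char] = words.flatMap(w => w.toList)

  def main(args: Array[String]): Unit = {
    println(lengths) // List(2, 0, 3)
    println(chars)   // List(a, b, c, d, e)
  }
}
```

Note that the empty string contributes one element to lengths but nothing
to chars. 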
Note however that an Array is not itself a Scala Seq. Scala will usually
convert it for you implicitly, but to be explicit you can transform it to a
Seq or something that inherits from AbstractSeq, like a List. 

For example, let's assume you have an RDD[(Array[Int])] and you want all the
Int values flattened into a single RDD[(Int)]. The code would be something
like this: 

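val intArraysRDD: RDD[Array[Int]] = ... // some code to get the arrays
val flattenedIntRDD: RDD[Int] = intArraysRDD.flatMap( array => {
    var ret: List[Int] = Nil
    for ( i <- array ) {
        ret = i :: ret   // prepends, so ret ends up in reverse order
    }
    ret.reverse          // assumed ending: restore the original order
})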

This is an intentionally explicit version. 
A simpler version would be something like this: 
val flattenedIntRDD: RDD[Int] = intArraysRDD.flatMap( array => array.toList )
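
If you want to try both approaches without a cluster, here is a rough
sketch on a plain Scala List of arrays. It is only a stand-in for the RDD,
and it assumes the explicit loop is meant to keep the original element
order, hence the reverse at the end. FlattenArraysSketch, intArrays,
explicitFlatten and simpleFlatten are hypothetical names. 

```scala
object FlattenArraysSketch {
  // Hypothetical stand-in for the contents of intArraysRDD.
  val intArrays: List[Array[Int]] =
    List(Array(1, 2), Array.empty[Int], Array(3))

  // Explicit version: build a List by prepending, then reverse it
  // so the elements come out in their original order.
  def explicitFlatten(arrays: List[Array[Int]]): List[Int] =
    arrays.flatMap { array =>
      var ret: List[Int] = Nil
      for (i <- array) ret = i :: ret
      ret.reverse
    }

  // Simpler version: let toList do the conversion.
  def simpleFlatten(arrays: List[Array[Int]]): List[Int] =
    arrays.flatMap(array => array.toList)

  def main(args: Array[String]): Unit = {
    println(explicitFlatten(intArrays)) // List(1, 2, 3)
    println(simpleFlatten(intArrays))   // List(1, 2, 3)
  }
}
```

The empty array simply contributes no elements to the result. 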

However, to understand your problem exactly, you need to explain better what
the RDD you want to create should look like. 
