spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gylfi <gy...@berkeley.edu>
Subject Re: Flatten list
Date Sat, 18 Jul 2015 07:21:47 GMT
Hi. 

To be honest I don't really understand your problem declaration :(  but lets
just talk about how .flatmap works. 
Unlike .map(), that only allows a one-to-one transformation, .flatmap()
allows 0, 1 or many outputs per item processed but the output must take the
form of a sequence of the same type, like a /List/ for example. 
All the sequences will then be merged (i.e. flattened) in the end into a
single RDD of that type. 
Note however that an array does not inherit from Sequence and thus you must
transform it to a Sequence or something that inherits from AbstractSeq, like
a List. 
See 
http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.List 
vs. 
http://www.scala-lang.org/api/current/index.html#scala.Array

For example, lets assume you have an RDD[(Array[Int])] and you want all the
Int values flattened into a single RDD[(Int)]. The code would be something
like so: 

val intArraysRDD : RDD[(Array[Int])] = ..."some code to get array"... 
val flattnedIntRDD : RDD[(Int)] = intArraysRDD.flatmap( array => {
    var ret : List[(Int)] = nil 
    for ( i <- array) {
        ret = i :: ret
    }
    ret
}) 

This is an intentionally explicit version.. 
A simpler could would be something like this .. 
val flattnedIntRDD : RDD[(Int)] = intArraysRDD.flatmap( array =>
array.toList)

However, to understand exactly your problem you need to explain better what
the RDD you want to create should look like.. 
 
Regards, 
   Gylfi. 



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Flatten-list-tp23887p23892.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Mime
View raw message