spark-user mailing list archives

From "ankurjain.nitrr" <ankurjain.ni...@gmail.com>
Subject Re: can't union two rdds
Date Tue, 31 Mar 2015 13:53:30 GMT
An RDD union will simply append the rows of one RDD to the other, giving:

  1 2
  3 4
  5 6
  7 8
  9 10
  11 12

What you are trying to do is a join.
A join needs some logic or key that decides which rows get matched.

I think in your case you want the order (index) of the rows to be the joining key.
An RDD is a distributed data structure, so it is not well suited to pairing rows by position.
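
To illustrate the difference, here is a minimal sketch in plain Python, with ordinary lists standing in for the contents of the two RDDs. The sample rows are an assumption based on the union output shown earlier, and the names rows_a, rows_b, keyed_a and keyed_b are only illustrative. In Spark itself, rdd.zipWithIndex() is one way to attach such an index before joining, though that is not shown here.

```python
# Plain lists stand in for the contents of the two RDDs (assumed sample data).
rows_a = [(1, 2), (3, 4), (5, 6)]
rows_b = [(7, 8), (9, 10), (11, 12)]

# Union: the rows of one collection are appended to the other; nothing is paired.
union = rows_a + rows_b

# Join: key every row by its position, then match rows that share a key.
keyed_a = {i: row for i, row in enumerate(rows_a)}
keyed_b = {i: row for i, row in enumerate(rows_b)}
common_keys = sorted(keyed_a.keys() & keyed_b.keys())
joined = [keyed_a[i] + keyed_b[i] for i in common_keys]

for row in joined:
    print(" ".join(str(value) for value in row))
```

The union keeps six two-column rows, while the index-keyed join produces three four-column rows.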

If the amount of data is small, you can call rdd.collect on both RDDs,
then iterate over the two resulting lists together and produce the desired result.
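
A rough sketch of that driver-side approach, assuming both RDDs have already been collected into ordinary Python lists. The helper name merge_collected and the sample rows are hypothetical, not something from your code, and the length check is my own addition since pairing by position only makes sense when both sides have the same number of rows.

```python
def merge_collected(left_rows, right_rows):
    """Pair rows by position and concatenate each pair into one row.

    Both arguments are plain lists, e.g. what rdd.collect() would return.
    """
    if len(left_rows) != len(right_rows):
        raise ValueError("both lists must have the same number of rows")
    # zip walks the two lists in step, so row i is paired with row i.
    return [tuple(left) + tuple(right) for left, right in zip(left_rows, right_rows)]


# Assumed sample data standing in for rdd1.collect() and rdd2.collect().
left_rows = [(1, 2), (3, 4), (5, 6)]
right_rows = [(7, 8), (9, 10), (11, 12)]

merged = merge_collected(left_rows, right_rows)
for row in merged:
    print(" ".join(str(value) for value in row))
```

This only works while everything fits in the driver's memory, which is why it is suggested for small data only.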



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/can-t-union-two-rdds-tp22320p22323.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
