spark-user mailing list archives

From Adrian Mocanu <amoc...@verticalscope.com>
Subject remove duplicates
Date Mon, 24 Mar 2014 16:44:59 GMT
I have a DStream like this:
..RDD[a,b],RDD[b,c]..

Is there a way to remove duplicates across the entire DStream? That is, I would like one
of the two b's removed, so that the output is either:
..RDD[a],RDD[b,c]..  or ..RDD[a,b],RDD[c]..
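
To make the intended semantics concrete, here is a minimal sketch in plain Python. It is not Spark code: the stream is modelled as an ordinary sequence of lists standing in for the RDDs, and the function name dedupe_across_batches is made up for illustration. It keeps the first occurrence of each element, which corresponds to the second output above.

```python
from typing import Hashable, Iterable, Iterator, List, Set


def dedupe_across_batches(
    batches: Iterable[List[Hashable]],
) -> Iterator[List[Hashable]]:
    """Yield each batch with already-seen elements removed.

    An element counts as seen if it appeared in an earlier batch
    or earlier in the same batch. Plain lists stand in for RDDs here.
    """
    seen: Set[Hashable] = set()  # every element emitted so far
    for batch in batches:
        fresh = []
        for item in batch:
            if item not in seen:
                seen.add(item)
                fresh.append(item)
        yield fresh


# Hypothetical input mirroring ..RDD[a,b],RDD[b,c]..
stream = [["a", "b"], ["b", "c"]]
result = list(dedupe_across_batches(stream))
print(result)  # [['a', 'b'], ['c']]
```

Note that the set of seen elements grows without bound in this sketch, so on a real, unbounded stream the same idea would need some form of bounded or windowed state.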

Thanks
-Adrian

