spark-user mailing list archives

From Xiang Huo <huoxiang5...@gmail.com>
Subject How to filter a sorted RDD
Date Mon, 04 Nov 2013 06:42:50 GMT
Hi all,

I am trying to filter a large RDD down to a smaller one, and the large RDD
is sorted. Is there any way to make the filter method stop as soon as one
element fails the condition, instead of checking every element in the RDD?
Because the large data set is sorted, once one element fails the condition,
none of the following elements can satisfy it. Checking them one by one
still takes a relatively long time.
Is there any way to save time on this step?
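
To make the idea concrete, here is a rough sketch in plain Python of the
behavior I am after. It is not Spark code: plain lists stand in for the
partitions of the sorted RDD, and the names take_sorted_prefix, keep and
partitions are made up for illustration. It also assumes that every
partition is itself sorted, so the first failing element in a partition
means the rest of that partition can be skipped.

```python
from itertools import takewhile


def take_sorted_prefix(partition, predicate):
    """Lazily yield elements of one partition, stopping at the first failure.

    Hypothetical helper: relies on the partition being sorted so that no
    element after the first failing one can satisfy the predicate.
    """
    return takewhile(predicate, partition)


# Plain lists standing in for the partitions of a sorted RDD (assumption).
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def keep(x):
    # Example condition; true only for a prefix of the sorted data.
    return x < 5


# Each partition is scanned only up to its first failing element.
result = [x for part in partitions for x in take_sorted_prefix(part, keep)]
print(result)  # [1, 2, 3, 4]
```

In PySpark I would expect something like
rdd.mapPartitions(lambda it: take_sorted_prefix(it, keep)) to apply the same
idea per partition, but that still visits every partition, each one stopping
at its first failing element, so I am not sure it is the best approach.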

Thanks,

Xiang

-- 
Xiang Huo
Department of Computer Science
University of Illinois at Chicago(UIC)
Chicago, Illinois
US
Email: huoxiang5659@gmail.com
           or xhuo4@uic.edu
