Hi all,

Is the sort order guaranteed to be preserved if you apply operations like map(), filter(), or distinct() after a sort in a distributed setting (running on a cluster of machines backed by HDFS)? In other words, does rdd.sortByKey().map() have the same sort order as rdd.sortByKey()? If so, is it documented somewhere which operations preserve sort order and which don't?
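
To make the question concrete, here is a minimal sketch of the pattern I mean. The sample data, the object and app names, and the local[4] master are all made up for illustration; in the real job the input is read from HDFS and runs on a cluster. The sketch only prints the key order after each operation so the results can be compared. It does not assume what the answer is.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Needed on older Spark releases for the pair-RDD implicits (sortByKey);
// harmless on newer ones.
import org.apache.spark.SparkContext._

// Hypothetical driver, for illustration only.
object SortOrderQuestion {
  def main(args: Array[String]): Unit = {
    // local[4] is an assumption for a quick test; the real job runs on a cluster.
    val conf = new SparkConf().setAppName("sort-order-question").setMaster("local[4]")
    val sc = new SparkContext(conf)

    // Made-up sample data spread over 4 partitions.
    val rdd = sc.parallelize(Seq(3 -> "c", 1 -> "a", 5 -> "e", 2 -> "b", 4 -> "d"), 4)

    val sorted   = rdd.sortByKey()
    val mapped   = sorted.map { case (k, v) => (k, v.toUpperCase) }
    val filtered = sorted.filter { case (k, _) => k % 2 == 1 }
    val deduped  = sorted.distinct()

    // Print the keys in the order collect() returns them for each RDD.
    def keys(name: String, r: org.apache.spark.rdd.RDD[(Int, String)]): Unit =
      println(name + ": " + r.collect().map(_._1).mkString(", "))

    keys("sortByKey", sorted)
    keys("sortByKey + map", mapped)
    keys("sortByKey + filter", filtered)
    keys("sortByKey + distinct", deduped)

    sc.stop()
  }
}
```

What I would like to know is whether the last three lines of output are guaranteed to stay in key order, or whether that just happens to hold on a small local run.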

Thanks,
Mingyu