spark-user mailing list archives

From Imran Rashid
Subject Re: can spark take advantage of ordered data?
Date Thu, 12 Mar 2015 01:05:03 GMT
Hi Jonathan,

You might be interested in
(not yet available) and (not part
of Spark, but it is available right now).  Hopefully that's what you are
looking for.  To the best of my knowledge, that covers what is available now
and what is being worked on.


On Wed, Mar 11, 2015 at 4:38 PM, Jonathan Coveney wrote:

> Hello all,
> I am wondering if spark already has support for optimizations on sorted
> data and/or if such support could be added (I am comfortable dropping to a
> lower level if necessary to implement this, but I'm not sure if it is
> possible at all).
> Context: we have a number of data sets which are essentially already
> sorted on a key. With our current systems, we can take advantage of this to
> do a lot of analysis in a very efficient fashion...merges and joins, for
> example, can be done very efficiently, as can folds on a secondary key and
> so on.
> I was wondering if spark would be a fit for implementing these sorts of
> optimizations? Obviously it is sort of a niche case, but would this be
> achievable? Any pointers on where I should look?
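
For readers unfamiliar with the optimization being asked about, here is a
minimal sketch of a join over two inputs that are already sorted by key. It
is plain Python, not Spark API, and the name merge_join is hypothetical; it
only illustrates why pre-sorted data allows a join in a single pass with no
shuffle or hash table.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right):
    """Inner-join two iterables of (key, value) pairs.

    Assumption: both inputs are already sorted by key in ascending order.
    """
    # groupby batches consecutive pairs that share a key, which only
    # works as a per-key grouping because the inputs are sorted.
    left_groups = groupby(left, key=itemgetter(0))
    right_groups = groupby(right, key=itemgetter(0))

    end = (None, None)  # sentinel: a group of None means the side is exhausted
    lkey, lgroup = next(left_groups, end)
    rkey, rgroup = next(right_groups, end)

    while lgroup is not None and rgroup is not None:
        if lkey < rkey:
            # Left key has no partner on the right; skip it.
            lkey, lgroup = next(left_groups, end)
        elif lkey > rkey:
            # Right key has no partner on the left; skip it.
            rkey, rgroup = next(right_groups, end)
        else:
            # Keys match: buffer the right values for this one key only,
            # then emit every left/right combination.
            rvalues = [value for _, value in rgroup]
            for _, lvalue in lgroup:
                for rvalue in rvalues:
                    yield lkey, (lvalue, rvalue)
            lkey, lgroup = next(left_groups, end)
            rkey, rgroup = next(right_groups, end)


left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (3, "y"), (4, "z"), (4, "w")]
joined = list(merge_join(left, right))
```

Each input is read once, front to back, and the only state held in memory
is the set of right-hand values for the key currently being matched.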
