Thanks for reporting back. Glad it worked for you. Actually, sum with partitioning behaves the same way in Oracle too.

I'd seen that already, but I was trying to avoid using RDDs to perform this calculation.

@Ayan, it seems I was mistaken, and doing a sum(b) over (order by b) totally works. I guess I expected the windowing with sum to work differently, based on what I remembered from Oracle. Thanks for the suggestion :)

I don't think that would work properly; it would probably just give me the sum for each partition. I'll give it a try when I get home just to be certain.

To maybe explain the intent better: if I have a (pre-sorted) column of (1,2,3,4), then the cumulative sum would return (1,3,6,10).
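
For reference, the intended result can be sketched in plain Python. This only illustrates the arithmetic on an in-memory list; it is not Spark code, and the variable names are made up for the example.

```python
from itertools import accumulate

# The pre-sorted column from the example above.
values = [1, 2, 3, 4]

# Each output element is the sum of all input elements up to and including it.
cumulative = list(accumulate(values))

print(cumulative)
```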

Does that make sense? Naturally, if ordering a sum turns it into a cumulative sum, I'll gladly use that :)

You mean you are not able to use sum(col) over (partition by key order by some_col)?
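
A rough sketch of what that query does, run against an in-memory SQLite database from Python rather than Spark, on the assumption that both follow the standard SQL window semantics here. The table name t, the column grp (standing in for key) and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical table: grp plays the role of "key" in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp TEXT, some_col INTEGER, col INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 1, 1), ("a", 2, 2), ("a", 3, 3), ("a", 4, 4),
     ("b", 1, 10), ("b", 2, 20)],
)

# Adding ORDER BY inside OVER turns the per-partition sum into a running sum:
# each row gets the total of col from the start of its partition up to itself.
rows = conn.execute(
    """
    SELECT grp, some_col,
           SUM(col) OVER (PARTITION BY grp ORDER BY some_col) AS running_total
    FROM t
    ORDER BY grp, some_col
    """
).fetchall()

for row in rows:
    print(row)
```

Without the order by, every row in a partition would get the same partition total. Note that rows with equal some_col values within a partition share one running total under the default window frame, so the ordering column should be unique per partition if a strict row-by-row sum is wanted.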

