spark-user mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: Partitioning to speed up processing?
Date Thu, 10 Mar 2016 19:55:39 GMT
Are they aggregates over the entire data set, or is some grouping applied?

On Thursday, March 10, 2016, Gerhard Fiedler <gfiedler@algebraixdata.com>
wrote:

> I have a number of queries that result in a sequence Filter > Project >
> Aggregate. I wonder whether partitioning the input table makes sense.
>
>
>
> Does Aggregate benefit from a partitioned input? If so, what partitions
> would be most useful (related to the aggregations)?
>
>
>
> Do Filter and Project preserve the partitioning of their inputs?
>
>
>
> Thanks,
>
> Gerhard
>
>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
