spark-user mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: spark.default.parallelism
Date Tue, 21 Jan 2014 22:37:54 GMT
Documentation suggestion:

Default number of tasks to use *across the cluster* for distributed shuffle
operations (<code>groupByKey</code>, <code>reduceByKey</code>, etc) when not
set by user.

Ognen, would that have clarified it for you?
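
For anyone reading along, here is a rough sketch of what that means in
practice (the app name, numbers, and key choice below are just illustrative,
not from the docs):

    import org.apache.spark.{SparkConf, SparkContext}

    object DefaultParallelismDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("default-parallelism-demo")
          // 4 reduce tasks total across the cluster for shuffle operations,
          // not 4 per core or per worker.
          .set("spark.default.parallelism", "4")
        val sc = new SparkContext(conf)

        val pairs = sc.parallelize(1 to 1000).map(i => (i % 10, i))

        // reduceByKey without an explicit numPartitions falls back to
        // spark.default.parallelism, so the shuffled RDD has 4 partitions.
        val reduced = pairs.reduceByKey(_ + _)
        println(s"partitions after shuffle: ${reduced.partitions.length}") // expect 4

        sc.stop()
      }
    }

If you pass a partition count explicitly, e.g. reduceByKey(_ + _, 16), that
overrides the default for that one shuffle.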


On Tue, Jan 21, 2014 at 3:35 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:

> It’s just 4 over the whole cluster.
>
> Matei
>
> On Jan 21, 2014, at 2:27 PM, Ognen Duzlevski <ognen@nengoiksvelzud.com>
> wrote:
>
> This is what docs/configuration.md says about the property:
> " Default number of tasks to use for distributed shuffle operations
> (<code>groupByKey</code>,
>     <code>reduceByKey</code>, etc) when not set by user.
> "
>
> If I set this property to, let's say, 4 - what does this mean? 4 tasks per
> core, per worker, per...? :)
>
> Thanks!
> Ognen
>
>
>
