spark-user mailing list archives

From myasuka <>
Subject Re: Why recommend 2-3 tasks per CPU core ?
Date Wed, 24 Sep 2014 14:12:02 GMT
Thanks for your reply! Specifically, in our development environment, I want to
know whether there is any way to make every task perform exactly one sub-matrix
multiplication.

After 'groupBy', every node with 16 cores should get 16 matrix pairs, so
I gave every node 16 tasks, hoping that every core would do one matrix
multiplication. However, some tasks do one, some do none, and some do
two. The performance bottleneck is that not every core does the same amount
of work; a core that does more needs more time.

Our algorithm evolved from an MPI version, which assumes that every core in the
grid does the same amount of work.
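
To illustrate the imbalance described above, here is a minimal pure-Python sketch (no Spark required). It assumes a hypothetical 4x4 grid of block keys, i.e. 16 matrix pairs, and compares a generic hash-based assignment with an explicit index-based one. The names `block_keys`, `hash_partition`, `grid_partition` and `partition_loads` are made up for this example and are not part of any Spark API.

```python
from collections import Counter

GRID = 4  # assumption: a 4x4 block grid, giving 16 matrix pairs


def block_keys(grid):
    """Return all (row, col) block keys of a grid x grid block matrix."""
    return [(i, j) for i in range(grid) for j in range(grid)]


def hash_partition(key, num_partitions):
    """Generic hash-based assignment; nothing forces one key per partition."""
    return hash(key) % num_partitions


def grid_partition(key, grid):
    """Explicit assignment: block (i, j) goes to partition i * grid + j."""
    i, j = key
    return i * grid + j


def partition_loads(keys, num_partitions, assign):
    """Count how many keys land in each partition under `assign`."""
    counts = Counter(assign(k) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


if __name__ == "__main__":
    keys = block_keys(GRID)
    n = GRID * GRID
    # Hash-based loads may contain zeros and values above one.
    print("hash:", partition_loads(keys, n, lambda k: hash_partition(k, n)))
    # The explicit mapping puts exactly one block pair in every partition.
    print("grid:", partition_loads(keys, n, lambda k: grid_partition(k, GRID)))
```

In PySpark, a function like `grid_partition` could presumably be passed as the `partitionFunc` argument of `RDD.partitionBy(numPartitions, partitionFunc)`. That only controls which partition a key lands in; it does not by itself guarantee that Spark schedules exactly one such partition on each core.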

Andrew Ash wrote
> Also you'd rather have 2-3 tasks per core than 1 task per core because if
> the 1 task per core is actually 1.01 tasks per core, then you have one wave
> of tasks complete and another wave of tasks with very few tasks in them.
> You get better utilization when you're higher than 1.
> Aaron Davidson goes into this more somewhere in this talk --
> On Mon, Sep 22, 2014 at 11:52 PM, Nicholas Chammas <nicholas.chammas@> wrote:
>> On Tue, Sep 23, 2014 at 1:58 AM, myasuka <myasuka@> wrote:
>>> Thus I want to know why  recommend
>>> 2-3 tasks per CPU core?
>> You want at least 1 task per core so that you fully utilize the cluster's
>> parallelism.
>> You want 2-3 tasks per core so that tasks are a bit smaller than they
>> would otherwise be, making them shorter and more likely to complete
>> successfully.
>> Nick
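
The "waves" argument quoted above can be checked with a little arithmetic. The sketch below is a simplified model, not a description of Spark's scheduler: it assumes all tasks are the same size, every core runs one task at a time, and the total amount of work is fixed regardless of how many tasks it is split into. The function names are hypothetical.

```python
import math


def waves(num_tasks, num_cores):
    """Number of scheduling waves needed to run all tasks."""
    return math.ceil(num_tasks / num_cores)


def makespan(total_work, num_tasks, num_cores):
    """Wall-clock time when total_work is split into equal-size tasks.

    Each task takes total_work / num_tasks, and every wave lasts as long
    as one task, even if only a few cores are busy in the last wave.
    """
    task_time = total_work / num_tasks
    return waves(num_tasks, num_cores) * task_time


if __name__ == "__main__":
    cores = 16
    work = 16.0  # assumption: one time unit of work per core
    for tasks in (16, 17, 48, 49):
        print(tasks, waves(tasks, cores), makespan(work, tasks, cores))
```

Under these assumptions, one extra task beyond one per core (17 tasks on 16 cores) forces a second, nearly empty wave and almost doubles the run time. One extra task beyond three per core (49 tasks) costs only one short extra wave, so the penalty for a slight imbalance is much smaller.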
