spark-user mailing list archives

From Deep Pradhan <pradhandeep1...@gmail.com>
Subject Re: Worker and Nodes
Date Sat, 21 Feb 2015 14:51:35 GMT
Yes, I am talking about a standalone single-node cluster.

No, I am not increasing parallelism. I just wanted to know whether this
slowdown is expected. Could message passing between the workers account
for it?

I am running SparkKMeans to validate a prediction model, using several
data sets in standalone mode. I am varying the number of workers from
1 to 16.

On Sat, Feb 21, 2015 at 8:14 PM, Sean Owen <sowen@cloudera.com> wrote:

> I can imagine a few reasons. Adding workers might cause fewer tasks to
> execute locally (?), so more of them may execute remotely.
>
> Are you increasing parallelism? For trivial jobs, chopping them up
> further may cause you to pay more overhead for managing so many small
> tasks, with no speedup in execution time.
>
> Can you provide any more specifics though? You haven't said what
> you're running, what mode, how many workers, how long it takes, etc.
>
> On Sat, Feb 21, 2015 at 2:37 PM, Deep Pradhan <pradhandeep1991@gmail.com>
> wrote:
> > Hi,
> > I have been running some jobs in my local single node stand alone
> > cluster. I am varying the worker instances for the same job, and the
> > time taken for the job to complete increases with increase in the
> > number of workers. I repeated some experiments varying the number of
> > nodes in a cluster too and the same behavior is seen.
> > Can the idea of worker instances be extrapolated to the nodes in a
> > cluster?
> >
> > Thank You
>
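
The point quoted above about paying overhead for many small tasks can be
sketched with a toy cost model. This is only an illustration, not Spark's
scheduler: the function name, the 4-core machine, the 8 seconds of total
work, and the 0.5 second per-task overhead are all hypothetical values
chosen for the example.

```python
import math


def estimated_wall_time(total_work_s, num_tasks, cores, per_task_overhead_s):
    """Toy estimate of job time on one machine (hypothetical model).

    Assumes the work splits evenly into `num_tasks` tasks, each task pays a
    fixed overhead, and tasks run in waves of at most `cores` at a time.
    """
    waves = math.ceil(num_tasks / cores)
    time_per_task = total_work_s / num_tasks + per_task_overhead_s
    return waves * time_per_task


if __name__ == "__main__":
    # Hypothetical numbers: 8 s of work, 4 cores, 0.5 s overhead per task.
    for n in (1, 2, 4, 8, 16):
        t = estimated_wall_time(8.0, n, cores=4, per_task_overhead_s=0.5)
        print(f"{n:2d} tasks: {t:.2f} s")
```

Under these assumptions the estimate falls until the task count matches
the core count (2.5 s at 4 tasks) and then rises again (4.0 s at 16
tasks), because extra tasks add overhead without adding cores. Real
numbers on a single-node standalone cluster would depend on the data
size, the JVM startup cost per worker, and how cores and memory are
divided among the workers.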
