spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: spark 1.6 foreachPartition only appears to be running on one executor
Date Fri, 11 Mar 2016 18:24:04 GMT
Hi,

How do you check which executor is used? Could you include a screenshot of
the master's web UI showing the workers?
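
If a screenshot is awkward, one way to check from inside the job is to have
each partition report the executor it ran on. This is only a minimal sketch,
not something I have run against your cluster: the object name, the app name
and the `parallelize` input are placeholders standing in for your real job,
and it assumes the job is launched with spark-submit so the master URL comes
from there.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

// Hypothetical diagnostic job: prints which executor processed each partition.
object ExecutorPerPartition {
  def main(args: Array[String]): Unit = {
    // Master URL is expected to be supplied by spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("executor-per-partition"))

    // Placeholder input with 16 partitions; substitute the real RDD here.
    val rdd = sc.parallelize(1 to 1600, 16)

    // This closure runs on the executors, one call per partition.
    val placement = rdd.mapPartitionsWithIndex { (idx, iter) =>
      val host = java.net.InetAddress.getLocalHost.getHostName
      Iterator((idx, SparkEnv.get.executorId, host, iter.size))
    }.collect()

    // Back on the driver: one line per partition.
    placement.sortBy(_._1).foreach { case (idx, execId, host, n) =>
      println(s"partition $idx -> executor $execId on $host ($n records)")
    }

    sc.stop()
  }
}
```

If every line shows the same executor ID, the tasks really are all being
scheduled on one executor; the Executors tab of the application UI should
show the same picture.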

Jacek
On 11.03.2016 6:57 PM, "Darin McBeath" <ddmcbeath@yahoo.com.invalid> wrote:

> I've run into a situation where it would appear that foreachPartition is
> only running on one of my executors.
>
> I have a small cluster (2 executors with 8 cores each).
>
> When I run a job with a small file (with 16 partitions), I can see that all
> 16 partitions are initialized, but they all appear to be initialized on only
> one executor. All of the work then runs on this one executor (even though
> there are 16 partitions). This seems odd, but at least it works. I'm not
> sure why the other executor was not used.
>
> However, when I run a larger file (again with 16 partitions), I can see
> that the 16 partitions are once again all initialized on the same
> executor. This time, though, the subsequent work is spread across both
> executors. This of course causes problems, because the second executor was
> never initialized: all of the partitions were initialized on the first
> executor.
>
> Does anyone have any suggestions for where I might want to investigate?
> Has anyone else seen something like this before? Any thoughts/insights
> would be appreciated. I'm using the standalone cluster manager, with the
> cluster started by the spark-ec2 scripts, and submitting my job using
> spark-submit.
>
> Thanks.
>
> Darin.
