spark-user mailing list archives

From Victor Tso-Guillen <v...@paxata.com>
Subject Re: What is a Block Manager?
Date Wed, 27 Aug 2014 17:28:39 GMT
I have long-lived state that I'd like to maintain on the executors. I'd like
to initialize it during a bootstrap phase and to update the master when
an executor leaves the cluster.
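
One way to picture the bookkeeping described above is a small registry that is fed block-manager added/removed events and calls a hook when an executor joins or leaves. The sketch below is a plain-Python model, not Spark API: BlockManagerEvent, ExecutorRegistry, and the on_join/on_leave hooks are hypothetical names, and the "<driver>" id used to skip the driver's own block manager is an assumption. In a real application the events would come from a SparkListener's onBlockManagerAdded and onBlockManagerRemoved callbacks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class BlockManagerEvent:
    """Hypothetical stand-in for a block-manager added/removed event."""
    executor_id: str
    host: str


@dataclass
class ExecutorRegistry:
    """Tracks live executors from block-manager events (illustrative only)."""
    on_join: Callable[[str, str], None]   # hypothetical bootstrap hook
    on_leave: Callable[[str, str], None]  # hypothetical "tell the master" hook
    driver_id: str = "<driver>"           # assumed id of the driver's block manager
    live: Dict[str, str] = field(default_factory=dict)

    def block_manager_added(self, event: BlockManagerEvent) -> None:
        # Ignore the driver and duplicate registrations.
        if event.executor_id == self.driver_id or event.executor_id in self.live:
            return
        self.live[event.executor_id] = event.host
        self.on_join(event.executor_id, event.host)

    def block_manager_removed(self, event: BlockManagerEvent) -> None:
        # Only report executors that were actually being tracked.
        host = self.live.pop(event.executor_id, None)
        if host is not None:
            self.on_leave(event.executor_id, host)


# Replay a made-up sequence of events and record what the hooks saw.
log = []
registry = ExecutorRegistry(
    on_join=lambda ex, host: log.append(("join", ex, host)),
    on_leave=lambda ex, host: log.append(("leave", ex, host)),
)
registry.block_manager_added(BlockManagerEvent("<driver>", "master-host"))
registry.block_manager_added(BlockManagerEvent("0", "worker-a"))
registry.block_manager_added(BlockManagerEvent("1", "worker-b"))
registry.block_manager_removed(BlockManagerEvent("0", "worker-a"))
print(log)
print(registry.live)
```

This only models the counting side. Whether a block-manager removal is a reliable signal that an executor has left is exactly the open question in this thread, so treat the mapping as an assumption.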


On Tue, Aug 26, 2014 at 11:18 PM, Liu, Raymond <raymond.liu@intel.com>
wrote:

> The framework has this information in order to manage cluster status, and
> some of it (e.g. the number of workers) is also available through the Spark
> metrics system. From the user application's point of view, though, can you
> give an example of why you need this information and what you plan to do
> with it?
>
> Best Regards,
> Raymond Liu
>
> From: Victor Tso-Guillen [mailto:vtso@paxata.com]
> Sent: Wednesday, August 27, 2014 1:40 PM
> To: Liu, Raymond
> Cc: user@spark.apache.org
> Subject: Re: What is a Block Manager?
>
> We're a single-app deployment so we want to launch as many executors as
> the system has workers. We accomplish this by not configuring the max for
> the application. However, is there really no way to inspect which
> machines, executor ids, number of workers, etc. are available in context?
> I'd imagine that there'd be something in the SparkContext or in the
> listener, but all I see in the listener is block managers getting added and
> removed.
> Wouldn't one care about the workers getting added and removed at least as
> much as for block managers?
>
> On Tue, Aug 26, 2014 at 6:58 PM, Liu, Raymond <raymond.liu@intel.com>
> wrote:
> Basically, a Block Manager manages the storage for most of the data in
> Spark, to name a few: blocks that represent cached RDD partitions,
> intermediate shuffle data, broadcast data, etc. There is one per executor,
> and in standalone mode you normally have one executor per worker.
>
> You don't control how many workers you have at runtime, but you can
> manage how many executors your application will launch. Check the
> documentation for each running mode for details. (But control where they
> run? Hardly. YARN mode does some work based on data locality, but this is
> done by the framework, not by the user program.)
>
> Best Regards,
> Raymond Liu
>
> From: Victor Tso-Guillen [mailto:vtso@paxata.com]
> Sent: Tuesday, August 26, 2014 11:42 PM
> To: user@spark.apache.org
> Subject: What is a Block Manager?
>
> I'm curious not only about what they do, but what their relationship is to
> the rest of the system. I find that I get listener events for n block
> managers added where n is also the number of workers I have available to
> the application. Is this a stable constant?
>
> Also, are there ways to determine at runtime how many workers I have and
> where they are?
>
> Thanks,
> Victor
>
>
