spark-dev mailing list archives

From Ewan Higgs <ewan.hi...@ugent.be>
Subject Re: Custom Cluster Managers / Standalone Recovery Mode in Spark
Date Sun, 01 Feb 2015 15:31:34 GMT
+1
On a related note, there is a lot of interest in Hadoop and Spark from 
the HPC community, who often run Slurm, PBS, and SGE to schedule jobs 
(as opposed to YARN and Mesos). Currently, several projects launch YARN 
clusters (or MR1 clusters) inside PBS jobs [1], but this is not ideal. 
It would be much better to run spark-submit 
pbs://master.whatever.org ... and launch the job directly.

I would also appreciate help on how to move forward on such a project 
for Spark, since Spark offers performance benefits over Hadoop 
MapReduce and I don't think Hadoop can currently be disentangled from 
YARN.

I think I need to define a new PbsExecutorBackend and 
PbsSchedulerBackend. IPython approaches this by writing a job script 
and shelling out to command-line tools like qsub, qdel, and qstat, 
because most job schedulers expose these tools as a front end [2]. That 
way we should be able to support Slurm, PBS, and SGE in one shot, 
rather than implementing a separate RPC wire format for each scheduler.
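For illustration, the shell-out approach could look something like the following minimal Python sketch. The class name, resource flags, and qstat parsing here are assumptions for the sake of the example, not an existing Spark or IPython API:

```python
import re
import subprocess

class PbsLauncher:
    """Illustrative sketch: launch a worker via PBS command-line tools."""

    def build_script(self, command, cores=4, walltime="01:00:00"):
        # A PBS job script: #PBS directives first, then the command to run.
        return "\n".join([
            "#!/bin/sh",
            f"#PBS -l nodes=1:ppn={cores}",
            f"#PBS -l walltime={walltime}",
            command,
        ])

    def submit(self, script_path):
        # qsub prints the new job id (e.g. "12345.master") on stdout.
        out = subprocess.check_output(["qsub", script_path], text=True)
        return out.strip()

    def status(self, qstat_output, job_id):
        # Parse the one-letter state column from `qstat <job_id>` output,
        # e.g. "12345.master  spark-worker  ewan  00:00:01  R  batch".
        for line in qstat_output.splitlines():
            if line.startswith(job_id.split(".")[0]):
                m = re.search(r"\s([QRECH])\s", line)
                if m:
                    return m.group(1)
        return None
```

Because qsub/qdel/qstat share roughly the same interface across PBS variants and wrappers exist for Slurm and SGE, the same small surface could be reused for all three.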

Thanks,
Ewan Higgs

[1] https://hadoop.apache.org/docs/r1.2.1/hod_scheduler.html
https://github.com/glennklockwood/hpchadoop
http://jaliyacgl.blogspot.be/2008/08/hadoop-as-batch-job-using-pbs.html
https://github.com/hpcugent/hanythingondemand

[2] http://ipython.org/ipython-doc/stable/parallel/parallel_process.html
https://github.com/ipython/ipython/blob/master/IPython/parallel/apps/launcher.py#L1150

On 31/01/15 09:55, Anjana Fernando wrote:
> Hi everyone,
>
> I've been experimenting with Spark and am somewhat of a newbie. I was
> wondering if there is any way I can use a custom cluster manager
> implementation with Spark. As I understand it, the built-in modes
> supported at the moment are standalone, Mesos, and YARN. My
> requirement is basically a simple clustering solution with high
> availability of the master. I don't want to use a separate ZooKeeper
> cluster, since that would complicate my deployment; rather, I would
> like to use something like Hazelcast, which has a peer-to-peer cluster
> coordination implementation.
>
> I found that there is already this JIRA [1], which requests a custom
> persistence engine, I guess for storing state information. So basically,
> what I would want to do is use Hazelcast for leader election, to
> make an existing node the master, and to look up the state information from
> distributed memory. I'd appreciate any help on how to achieve this. And if
> it is useful for a wider audience, hopefully I can contribute it back to the
> project.
>
> [1] https://issues.apache.org/jira/browse/SPARK-1180
>
> Cheers,
> Anjana.
>
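For what it's worth, the leader-election pattern Anjana describes (with Hazelcast, ZooKeeper, or any other coordination layer) boils down to contending for a distributed lock. A hypothetical sketch of the agent-side logic, with the coordination backend abstracted behind a lock object (all names here are illustrative, not Spark's actual recovery API):

```python
import threading

class ElectionAgent:
    """Illustrative sketch: contend for a lock held by the coordination
    layer; whoever acquires it becomes master, the rest stay in standby."""

    def __init__(self, node_id, lock):
        self.node_id = node_id
        self.lock = lock  # stand-in for e.g. a Hazelcast distributed lock
        self.is_leader = False

    def contend(self):
        # Non-blocking attempt. In a real deployment, standbys would
        # re-contend when the leader's lock is released on node failure.
        self.is_leader = self.lock.acquire(blocking=False)
        return self.is_leader

# Demo with a local lock as the stand-in: two nodes contend, one wins.
lock = threading.Lock()
nodes = [ElectionAgent(i, lock) for i in range(2)]
winners = [n for n in nodes if n.contend()]
```

The persistence-engine half of SPARK-1180 would then be the complementary piece: the winner reads worker/app state back out of the distributed map on failover.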


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

