spark-user mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: Master vs. Slave Nodes Clarification
Date Tue, 14 Jul 2015 21:50:37 GMT
Yep :)

On Tue, Jul 14, 2015 at 2:44 PM, algermissen1971 <algermissen1971@icloud.com> wrote:

>
> On 14 Jul 2015, at 23:26, Tathagata Das <tdas@databricks.com> wrote:
>
> > Just to be clear, you mean the Spark Standalone cluster manager's
> > "master" and not the application's "driver", right?
>
> Sorry, by now I have understood that I would not necessarily put the
> driver app on the master node and that not making that distinction made my
> question kind of hard to answer :-)
>
> So far I have understood that for a Spark Streaming app that uses the
> Cassandra connector (and also needs checkpointing):
>
> slaves: need Spark, C*, the connector and access to a distributed file
> system for the checkpointing
> master: needs Spark (configured as master) but none of the rest
> the node where the driver runs: needs Spark, C*, the connector and access
> to a distributed file system for the checkpointing
>
> Correct?
>
> (And thanks to everyone for the replies)
>
>
> Jan
>
>
>
> > In that case, the earlier responses are correct.
> >
> > TD
> >
> > On Tue, Jul 14, 2015 at 11:26 AM, Mohammed Guller <mohammed@glassbeam.com> wrote:
> > The master node does not have to be similar to the worker nodes. It can
> be a smaller machine.
> >
> > As for C*, again, you don't need to have C* on the master node. You
> > need C* and the Spark workers co-located. The master can be on one of the
> > C* nodes or on a non-C* node.
> >
> > Mohammed
> >
> >
> > -----Original Message-----
> > From: algermissen1971 [mailto:algermissen1971@icloud.com]
> > Sent: Sunday, July 12, 2015 12:35 PM
> > To: Spark User
> > Subject: Master vs. Slave Nodes Clarification
> >
> > Hi,
> >
> > I have a question that I am having real trouble figuring out for
> > myself:
> >
> > Does the master node in a Spark cluster need to be a node similar to the
> > slave nodes, or should I rather view it as a coordinating node that does
> > not need much computing or storage power?
> >
> > For example, when using Spark Streaming and Checkpointing, would the
> master node need access to the shared file system (e.g. HDFS)? Or do I only
> need to mount that on the slaves?
> > (likewise, if I use the Cassandra-Connector, does that (and C*) need to
> be installed on the master node, too?)
> >
> > Or, in other words: is the master just one of several similar cluster
> > nodes, or is it merely a 'small control node', for which pretty much any
> > small VM would do?
> >
> > Jan
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> > For additional commands, e-mail: user-help@spark.apache.org
> >
> >
> >
>
>
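The role requirements agreed on in the quoted thread can be written down as a small lookup table. This is only a sketch in plain Python: the role names, the component labels and the `missing_components` helper are made up for illustration and are not Spark or Cassandra identifiers. It just mirrors Jan's list as confirmed above.

```python
# Components each node role needs, as summarised in the thread.
# All names here are illustrative labels (assumptions), not real Spark settings.
REQUIREMENTS = {
    "master": {"spark"},
    "worker": {"spark", "cassandra", "connector", "shared_fs"},
    "driver": {"spark", "cassandra", "connector", "shared_fs"},
}


def missing_components(role, installed):
    """Return, sorted, the components a node still lacks for the given role."""
    return sorted(REQUIREMENTS[role] - set(installed))


if __name__ == "__main__":
    # A small VM with only Spark installed is enough for the master role.
    print(missing_components("master", ["spark"]))  # prints []
    # The same VM would not do as a worker for this streaming app.
    print(missing_components("worker", ["spark"]))
```

Keeping the table separate from the check makes it easy to adjust if a deployment differs, for example when the driver runs on one of the worker nodes.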
