spark-dev mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: CallbackServer in PySpark Streaming
Date Thu, 12 Feb 2015 01:50:01 GMT
Yes.

On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao <todd.gao.2013+spark@gmail.com> wrote:
> Thanks Davies.
> I am not very familiar with Spark Streaming. Do you mean that the compute
> routine of a DStream is invoked only on the driver node,
> while only the compute routines of RDDs are distributed to the slaves?
>
> On Thu, Feb 12, 2015 at 2:38 AM, Davies Liu <davies@databricks.com> wrote:
>>
>> The CallbackServer is part of Py4J; it is used only in the driver, not
>> in slaves or workers.
>>
>> On Wed, Feb 11, 2015 at 12:32 AM, Todd Gao
>> <todd.gao.2013+spark@gmail.com> wrote:
>> > Hi all,
>> >
>> > I am reading the code of PySpark and its Streaming module.
>> >
>> > In PySpark Streaming, when the `compute` method of a
>> > PythonTransformedDStream instance is invoked, a connection to the
>> > CallbackServer is created internally.
>> > I wonder where the CallbackServer for each PythonTransformedDStream
>> > instance lives on the slave nodes in a distributed environment.
>> > Is there a CallbackServer running on every slave node?
>> >
>> > thanks
>> > Todd
>
>
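
The split confirmed in this thread (the DStream-level `compute` callback runs
only in the driver, while the per-record RDD work is shipped to the workers)
can be illustrated with a toy model. This is a plain-Python sketch, not Spark
or Py4J code: `ToyDriver`, `ToyWorker`, `per_batch` and the log strings are
all hypothetical names made up for illustration, and the real scheduling is
considerably more involved.

```python
# Toy model only: plain Python, no Spark or Py4J involved.
# Every name below is hypothetical and exists only for this illustration.


class ToyWorker:
    """Stands in for a worker: it only ever runs per-record functions."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def run(self, func, partition):
        self.log.append("per-record function on %s" % self.name)
        return [func(x) for x in partition]


class ToyDriver:
    """Stands in for the driver: the only place the per-batch callback runs."""

    def __init__(self, workers):
        self.workers = workers
        self.log = []

    def compute(self, per_batch, batch):
        # Analogous to the DStream-level compute step: called once per batch,
        # in the driver, to decide what per-record work to hand out.
        self.log.append("per-batch callback on driver")
        per_record = per_batch(batch)
        # Split the batch round-robin into one partition per worker.
        n = len(self.workers)
        partitions = [batch[i::n] for i in range(n)]
        results = []
        for worker, part in zip(self.workers, partitions):
            results.extend(worker.run(per_record, part))
        return results


def per_batch(batch):
    # Runs in the driver; returns the function the workers will apply.
    offset = len(batch)
    return lambda x: x + offset


workers = [ToyWorker("worker-0"), ToyWorker("worker-1")]
driver = ToyDriver(workers)
out = driver.compute(per_batch, [1, 2, 3, 4])

print(sorted(out))
print(driver.log)
print([w.log for w in workers])
```

In this model the per-batch callback is logged only by the driver, and the
workers never need a callback channel of their own, which mirrors the answer
given above: no CallbackServer is needed on the worker side.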


