spark-dev mailing list archives

From Kai Chen <>
Subject Re: Add hot-deploy capability in Spark Shell
Date Mon, 06 Jun 2016 23:24:37 GMT
I don't.  Hot-deploy shouldn't happen while a job is running; at least
in the REPL it wouldn't make much sense.  It's a development-only
feature meant to shorten the iterative coding cycle.  In a production
environment it is not enabled ... though there might be situations where it
would be desirable.  I'm not handling those currently, as they are much more
complex.

On Mon, Jun 6, 2016 at 4:16 PM, Reynold Xin <> wrote:

> Thanks for the email. How do you deal with in-memory state that references
> the classes? This can happen in streaming, in RDD caching, and in
> temporary view creation in SQL.
> On Mon, Jun 6, 2016 at 3:40 PM, S. Kai Chen <>
> wrote:
>> Hi,
>> We use spark-shell heavily for ad-hoc data analysis as well as iterative
>> development of the analytics code. A common workflow consists of the
>> following steps:
>>    1. Write a small Scala module, assemble the fat jar
>>    2. Start spark-shell with the assembly jar file
>>    3. Try out some ideas in the shell, then capture the code back into
>>    the module
>>    4. Go back to step 1 and restart the shell
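>> The cycle above can be sketched as shell commands (the project layout,
>> jar name, and sbt-assembly build are illustrative assumptions, not from
>> our actual setup):

```shell
# Step 1: build the fat jar (assumes an sbt project with the
# sbt-assembly plugin; output path is illustrative for Scala 2.10).
sbt assembly

# Step 2: start spark-shell with the assembly on the classpath.
spark-shell --jars target/scala-2.10/analytics-assembly.jar

# Step 3: experiment in the REPL, then copy working snippets back
# into the Scala module.
# Step 4: exit the shell, edit the module, and repeat from `sbt assembly`.
```

>> Each iteration pays the full cost of the sbt build plus a fresh Spark
>> startup, which is the wait the hot-deploy feature is meant to remove.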
>> This is very similar to what people do in web-app development. And the
>> pain point is similar: in web-app development, a lot of time is spent
>> waiting for new code to be deployed; here, a lot of time is spent waiting
>> for Spark to restart. Having the ability to hot-deploy code in the REPL
>> would help a lot, just as being able to hot-deploy in containers like Play,
>> or using JRebel, has helped boost productivity tremendously.
>> I do have code that works with the 1.5.2 release.  Is this something
>> that's interesting enough to be included in Spark proper?  If so, should I
>> create a Jira ticket or github PR for the master branch?
>> Cheers,
>> Kai
