spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Is spark context in local mode thread-safe?
Date Mon, 09 Jun 2014 23:47:06 GMT
No, there’s only one UI per SparkContext.

On Jun 9, 2014, at 4:43 PM, DB Tsai <dbtsai@stanford.edu> wrote:

> What if multiple threads use the same SparkContext? Will each
> thread have its own UI? In that case, it will quickly run out of
> ports.
> 
> Thanks.
> 
> Sincerely,
> 
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
> 
> 
> On Mon, Jun 9, 2014 at 4:34 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
>> You currently can’t have multiple SparkContext objects in the same JVM, but
>> within a SparkContext, all of the APIs are thread-safe, so you can share that
>> context between multiple threads. The other issue you’ll run into is that in
>> each thread where you want to use Spark, you need to use SparkEnv.set(env),
>> where “env” was obtained by SparkEnv.get in the thread that created the
>> context. This requirement will hopefully go away soon.
>> 
>> Unfortunately there’s no way yet to disable the UI. Feel free to open a JIRA
>> for it; it shouldn’t be hard to do.
>> 
>> Matei
>> 
>> On Jun 9, 2014, at 3:50 PM, DB Tsai <dbtsai@stanford.edu> wrote:
>> 
>>> Hi guys,
>>> 
>>> We would like to use the Spark Hadoop API to fetch the first couple hundred
>>> lines at design time, so we can quickly show users the file structure,
>>> metadata, and the values in those lines without launching a full Spark
>>> job on the cluster.
>>> 
>>> Since ours is a web-based application, multiple users will be using
>>> the Spark Hadoop API, for example sc.textFile(filePath). I wonder if
>>> those APIs are thread-safe in local mode (each user will have their own
>>> SparkContext object).
>>> 
>>> Secondly, it seems that even in local mode, the Jetty UI tracker will
>>> be launched. For this kind of cheap operation, starting a Jetty UI tracker
>>> for each operation will be very expensive. Is there a way to disable
>>> this behavior?
>>> 
>>> Thanks.
>>> 
>>> Sincerely,
>>> 
>>> DB Tsai
>>> -------------------------------------------------------
>>> My Blog: https://www.dbtsai.com
>>> LinkedIn: https://www.linkedin.com/in/dbtsai
>> 
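
The pattern described in this thread (one SparkContext per JVM, shared across
threads, with SparkEnv.set called in each thread) might look roughly like the
sketch below. It assumes the Spark 1.0-era Scala API discussed here; the object
name, the file path, the thread count, and the 200-line limit are illustrative
assumptions, not taken from the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

// Hypothetical example object; not part of Spark.
object SharedContextPreview {
  def main(args: Array[String]): Unit = {
    // Assumed input path for illustration; pass a real file as the first argument.
    val filePath = if (args.nonEmpty) args(0) else "/tmp/sample.txt"

    // A single SparkContext (and therefore a single UI) for the whole JVM.
    val conf = new SparkConf().setMaster("local[2]").setAppName("preview")
    val sc = new SparkContext(conf)

    // Capture the SparkEnv in the thread that created the context.
    val env = SparkEnv.get

    // Three threads standing in for concurrent web requests.
    val workers = (1 to 3).map { i =>
      new Thread(new Runnable {
        def run(): Unit = {
          // As described in the thread: set the creator's env in this thread first.
          SparkEnv.set(env)
          // Read only the first couple hundred lines of the file.
          val firstLines = sc.textFile(filePath).take(200)
          println(s"request $i: read ${firstLines.length} lines")
        }
      })
    }

    workers.foreach(_.start())
    workers.foreach(_.join())
    sc.stop()
  }
}
```

Because every thread reuses the same context, only one UI port is consumed no
matter how many requests run concurrently. Per the reply above, the
SparkEnv.set call was expected to become unnecessary in later releases.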

