spark-user mailing list archives

From Chester Chen <chesterxgc...@yahoo.com>
Subject Re: Is spark context in local mode thread-safe?
Date Tue, 10 Jun 2014 00:34:56 GMT
Matei, 
	Thanks for the insight; we will have to consider our design carefully. We are in the process
of moving our system to Akka, and it would be nice to use Akka all the way, but I understand
the limitations.

Thanks
Chester


On Monday, June 9, 2014 5:06 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
 


In general you probably shouldn’t use actors for processing requests because Spark operations
are blocking, and Akka only has a limited thread pool for each ActorSystem. You risk blocking
all the threads with ongoing requests and not being able to service new ones. That said though,
you can configure Akka to spawn more threads and in that case it would probably be okay. See http://doc.akka.io/docs/akka/snapshot/java/dispatchers.html
for some details on Akka thread usage and how to configure it.
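
To illustrate the point above about blocking calls exhausting a bounded thread pool, here is a minimal sketch in plain Java. It uses java.util.concurrent as a stand-in for an Akka dispatcher rather than Akka's own API, and the names BlockingPoolDemo and newRequestServed are hypothetical. The latch plays the role of a long-running, blocking Spark operation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: a fixed thread pool stands in for an actor dispatcher.
class BlockingPoolDemo {

    // Submits `blockers` tasks that block until released, then reports whether
    // one more short request completes within `waitMillis` on `poolSize` threads.
    static boolean newRequestServed(int poolSize, int blockers, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Assumption: the latch models a blocking Spark call that has not finished yet.
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockers; i++) {
                pool.submit(() -> {
                    release.await(); // occupies a pool thread until released
                    return null;
                });
            }
            Future<String> quick = pool.submit(() -> "served");
            try {
                quick.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;  // a free thread picked up the new request
            } catch (TimeoutException e) {
                return false; // every thread was busy with a blocking task
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }
}
```

With two threads and two blocking tasks, the extra request should time out; with four threads it should be served. That is the same trade-off as giving blocking work a larger or dedicated pool.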

Matei



On Jun 9, 2014, at 4:54 PM, Chester Chen <chesterxgchen@yahoo.com> wrote:

Matei, 
>If we use different Akka actors (not different threads) to process different users' requests,
is the SparkContext still safe to share between users?
>
>
>Yes, it would be nice to be able to disable the UI via configuration, especially when we develop
locally. We use the sbt-web plugin to debug Tomcat code. If we could disable the UI HTTP server,
it would be much simpler than having two HTTP containers to deal with.
>
>
>Chester
>
>
>
>
>
>On Monday, June 9, 2014 4:35 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> 
>
>
>You currently can’t have multiple SparkContext objects in the same JVM, but within a
SparkContext, all of the APIs are thread-safe so you can share that context between multiple
threads. The other issue you’ll run into is that in each thread where you want to use Spark,
you need to use SparkEnv.set(env) where “env” was obtained by SparkEnv.get in the thread
that created the context. This requirement will hopefully go away soon.
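
The capture-and-set pattern described above can be sketched in plain Java. This is only an analogy, assuming the environment is held per thread, as the requirement suggests: EnvHolder is a hypothetical stand-in built on ThreadLocal, not Spark's SparkEnv, and the string "driver-env" is a placeholder value.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical per-thread environment holder; NOT Spark's SparkEnv.
class EnvHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
    static void set(String env) { CURRENT.set(env); }
    static String get() { return CURRENT.get(); }
}

class EnvPropagationDemo {

    // Returns the environment a worker thread sees. When `propagate` is true,
    // the worker first installs the value captured on the creating thread.
    static String envSeenByWorker(boolean propagate) {
        EnvHolder.set("driver-env");             // done by the thread that "created the context"
        final String captured = EnvHolder.get(); // capture once, on the creating thread
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return worker.submit(() -> {
                if (propagate) {
                    EnvHolder.set(captured);     // analogous to calling set(env) in each thread
                }
                return EnvHolder.get();
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

In this sketch the worker sees no environment unless it explicitly installs the captured one, which is why the set call would need to be repeated in every thread that uses the shared context.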
>
>Unfortunately there’s no way yet to disable the UI — feel free to open a JIRA for
it, it shouldn’t be hard to do.
>
>Matei
>
>
>On Jun 9, 2014, at 3:50 PM, DB Tsai <dbtsai@stanford.edu> wrote:
>
>> Hi guys,
>> 
>> We would like to use the Spark Hadoop API to read the first couple hundred
>> lines at design time, to quickly show users the file structure/metadata
>> and the values in those lines without launching a full Spark
>> job on the cluster.
>> 
>> Since we're a web-based application, there will be multiple users using
>> the Spark Hadoop API, for example, sc.textFile(filePath). I wonder if
>> those APIs are thread-safe in local mode (each user will have their own
>> SparkContext object).
>> 
>> Secondly, it seems that even in local mode, the Jetty UI tracker will
>> be launched. For this kind of cheap operation, launching a Jetty UI tracker
>> for each operation would be very expensive. Is there a way to disable
>> this behavior?
>> 
>> Thanks.
>> 
>> Sincerely,
>> 
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
>