spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: How to force parallel processing of RDD using multiple thread
Date Fri, 16 Jan 2015 04:03:54 GMT
Check the number of partitions in your input. It may be much less than
the available parallelism of your small cluster. For example, input
that lives in just 1 partition will spawn just 1 task.

Beyond that, parallelism happens automatically. You can see the
parallelism of each stage (tasks launched per stage) in the Spark UI.
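To make the advice above concrete: a sketch of checking and raising the partition count from the Spark shell. The input path is a placeholder, `sc` is the SparkContext the shell provides, and the target of 8 partitions is illustrative, not a recommendation.

```scala
// Spark shell sketch; assumes an existing SparkContext `sc`
// and a hypothetical input path.
val rdd = sc.textFile("hdfs:///some/input")

// Spark launches one task per partition, so this number caps
// the parallelism of any operation on this RDD.
println(rdd.partitions.length)

// If it is smaller than the cores available, spread the data out
// across more partitions before the expensive work.
val wider = rdd.repartition(8)
println(wider.partitions.length)
```

`repartition` triggers a shuffle; when only reducing the partition count, `coalesce` avoids that cost.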

On Thu, Jan 15, 2015 at 10:53 PM, Wang, Ningjun (LNG-NPV)
<ningjun.wang@lexisnexis.com> wrote:
> Spark Standalone cluster.
>
> My program is running very slow; I suspect it is not doing parallel processing of the RDD.
> How can I force it to run in parallel? Is there any way to check whether it is processed in parallel?
>
> Regards,
>
> Ningjun Wang
> Consulting Software Engineer
> LexisNexis
> 121 Chanlon Road
> New Providence, NJ 07974-1541
>
>
> -----Original Message-----
> From: Sean Owen [mailto:sowen@cloudera.com]
> Sent: Thursday, January 15, 2015 4:29 PM
> To: Wang, Ningjun (LNG-NPV)
> Cc: user@spark.apache.org
> Subject: Re: How to force parallel processing of RDD using multiple thread
>
> What is your cluster manager? For example, on YARN you would specify --executor-cores. Read:
> http://spark.apache.org/docs/latest/running-on-yarn.html
>
> On Thu, Jan 15, 2015 at 8:54 PM, Wang, Ningjun (LNG-NPV) <ningjun.wang@lexisnexis.com> wrote:
>> I have a standalone Spark cluster with only one node with 4 CPU cores.
>> How can I force Spark to do parallel processing of my RDD using
>> multiple threads? For example, I can do the following
>>
>>
>>
>> spark-submit --master local[4]
>>
>>
>>
>> However, I really want to use the cluster as follows:
>>
>>
>>
>> spark-submit --master spark://10.125.21.15:7070
>>
>>
>>
>> In that case, how can I make sure the RDD is processed with multiple
>> threads/cores?
>>
>>
>>
>> Thanks
>>
>> Ningjun
>>
>>
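Tying the two replies together: on a standalone master, the knobs analogous to YARN's --executor-cores are --total-executor-cores and --executor-cores on spark-submit. A sketch, using the master URL from the thread; the core count and jar name are illustrative:

```shell
# Standalone-mode sketch: cap the cores the app may claim across
# the cluster. Core count and application jar are placeholders.
spark-submit \
  --master spark://10.125.21.15:7070 \
  --total-executor-cores 4 \
  my-app.jar
```

Note that even with 4 cores granted, an RDD with a single partition still runs as a single task, which is why checking the partition count comes first.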

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

