Hi Mayur, thanks for replying.
I know a Spark application takes all cores by default. My question is how to set the number of tasks on each core.
If one slice means one task, how can I set the slice file size?
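For concreteness, a minimal Scala sketch of the knob I am asking about (assuming an existing SparkContext `sc`; the Tachyon path is a placeholder). One task runs per partition (slice):

    // textFile takes a minimum partition-count hint for the input splits
    val rdd = sc.textFile("tachyon://master:19998/path/to/table", 40)
    // or force an exact count on an existing RDD (this does a shuffle)
    val wider = rdd.repartition(40)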


2014-05-23 16:37 GMT+08:00 Mayur Rustagi <mayur.rustagi@gmail.com>:
How many cores do you see on your Spark master (port 8080)?
By default a Spark application takes all cores when you launch it, unless you have set the max cores configuration.
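If you ever want to cap that, a minimal sketch (standalone mode; the master URL and app name are placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("spark://master:7077")   // placeholder master URL
      .setAppName("core-cap-example")
      .set("spark.cores.max", "8")        // cap cores taken cluster-wide
    val sc = new SparkContext(conf)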


Mayur Rustagi
Ph: +1 (760) 203 3257


On Thu, May 22, 2014 at 4:07 PM, qingyang li <liqingyang1985@gmail.com> wrote:
My aim in setting the task number is to increase query speed, and I have also found that "mapPartitionsWithIndex at Operator.scala:333" is costing a lot of time. So my other question is:
how do I tune mapPartitionsWithIndex to bring that time down?
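One generic lever, sketched below on the assumption that the slow stage simply runs over too few partitions (you have 5 machines x 4 cores = 20 task slots): widen the RDD before the per-partition work so every core stays busy. This is plain Spark, not Shark's internal Operator code, and `rdd` stands in for whatever feeds the slow stage:

    // ~2x the total core count is a common starting point
    val widened = rdd.repartition(40)
    val tagged = widened.mapPartitionsWithIndex { (idx, iter) =>
      // the per-partition work is unchanged; only its parallelism grows
      iter.map(line => "partition " + idx + ": " + line)
    }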




2014-05-22 18:09 GMT+08:00 qingyang li <liqingyang1985@gmail.com>:

I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40" in shark-env.sh,
but I find there are only 10 tasks on the cluster, 2 tasks on each machine.
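For what it's worth, spark.default.parallelism only sets the default partition count for shuffle operations (reduceByKey, groupByKey, and so on) and for sc.parallelize; an RDD read from storage gets its partition count from the number of input splits, which would explain seeing 10 tasks (e.g. 10 blocks) despite the setting. A minimal sketch of the difference, assuming a SparkContext `sc` launched with that property:

    val data = sc.parallelize(1 to 1000)         // uses the default: 40 partitions
    val fromFile = sc.textFile("/path/to/table") // partitions = input splits, not 40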


2014-05-22 18:07 GMT+08:00 qingyang li <liqingyang1985@gmail.com>:

I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40" in shark-env.sh.


2014-05-22 17:50 GMT+08:00 qingyang li <liqingyang1985@gmail.com>:

I am using Tachyon as the storage system and Shark to query a table, which is a big table. I have 5 machines in a Spark cluster, and there are 4 cores on each machine.
My questions are:
1. How do I set the number of tasks on each core?
2. Where can I see how many partitions an RDD has? (see the sketch below)
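On question 2, a minimal sketch, assuming an existing RDD `rdd`; the stages page of the application web UI (port 4040) shows the same number as the task count:

    println(rdd.partitions.size)  // partition count = tasks in that stage
    println(rdd.toDebugString)    // lineage, with a partition count per RDD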