spark-user mailing list archives

From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: Spark.default.parallelism can not set reduce number
Date Fri, 20 May 2016 11:20:06 GMT
You need to use `spark.sql.shuffle.partitions`. `spark.default.parallelism` only controls the default partition count for RDD shuffle operations; Spark SQL shuffles (joins, aggregations, etc.) take their partition count from `spark.sql.shuffle.partitions`, which defaults to 200.
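
For example, a minimal sketch against the Spark 1.x API (assuming an existing SQLContext named `sqlContext`; the value 20 just mirrors your intended setting):

  // Spark SQL shuffles read their partition count from
  // spark.sql.shuffle.partitions (default 200), not spark.default.parallelism.
  sqlContext.setConf("spark.sql.shuffle.partitions", "20")

  // or set it once for all jobs in conf/spark-defaults.conf:
  //   spark.sql.shuffle.partitions   20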

// maropu

On Fri, May 20, 2016 at 8:17 PM, ε–œδΉ‹ιƒŽ <251922566@qq.com> wrote:

>  Hi all.
> I set spark.default.parallelism to 20 in spark-defaults.conf and sent
> this file to all nodes.
> But I found the reduce number is still the default value, 200.
> Has anyone else encountered this problem? Can anyone give some advice?
>
> ############
> [Stage 9:>                                                        (0 + 0) / 200]
> [Stage 9:>                                                        (0 + 2) / 200]
> [Stage 9:>                                                        (1 + 2) / 200]
> [Stage 9:>                                                        (2 + 2) / 200]
> #######
>
> This results in many empty files. Because my data is small, only some
> of the 200 files contain data.
> #######
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00000
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00001
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00002
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00003
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00004
>  2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00005
> ########
>


-- 
---
Takeshi Yamamuro
