spark-user mailing list archives

From "Apostolos N. Papadopoulos" <papad...@csd.auth.gr>
Subject Re: Parallelism: behavioural difference in version 1.2 and 2.1!?
Date Wed, 29 Aug 2018 14:06:51 GMT
Dear Jeevan,

Spark 1.2 is quite old, and if I were you I would go for a newer version.

However, is there a parallelism level (e.g., 20, 30) that works for both 
installations?
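
One way to look for such a level is to fix the partition count explicitly
instead of relying on sc.defaultParallelism, and time the same job at a few
values on each cluster. The sketch below is only an illustration in PySpark
style: the helper names (total_cores, wordcount_seconds, sweep) are
hypothetical, the SparkContext is passed in as sc, and the input path is
whatever file you are benchmarking.

```python
import time


def total_cores(instances, cores_per_instance):
    """Cores available to the cluster, e.g. 8 instances x 4 cores = 32."""
    return instances * cores_per_instance


def wordcount_seconds(sc, path, num_partitions):
    """Run a word count with an explicit partition count; return elapsed seconds.

    `sc` is an existing SparkContext. `minPartitions` is only a lower bound
    for the input split, while `numPartitions` fixes the number of reduce
    partitions after the shuffle.
    """
    start = time.time()
    counts = (
        sc.textFile(path, minPartitions=num_partitions)
        .flatMap(lambda line: line.split())
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b, numPartitions=num_partitions)
    )
    counts.count()  # action that forces the job to actually run
    return time.time() - start


def sweep(sc, path, levels):
    """Time the same job once per parallelism level (hypothetical helper)."""
    return {n: wordcount_seconds(sc, path, n) for n in levels}
```

For example, sweep(sc, path, [20, 30, 32, 96]) run on both installations
would show whether any single level behaves acceptably on both, where 32 is
total_cores(8, 4) and 96 is three times that.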

regards,

Apostolos



On 29/08/2018 04:55 PM, jeevan.ks wrote:
> Hi,
>
> I have two systems: one built on Spark 1.2 and the other on 2.1. I am
> benchmarking both with the same benchmarks (wordcount, grep, sort, etc.)
> and the same data set from an S3 bucket (sizes range from 50 MB to 10 GB).
> The Spark cluster I used consists of 8 r3.xlarge instances, with 4 cores
> each and 28 GB RAM. I observed some strange behaviour while running the
> benchmarks:
>
> - When I ran the Spark 1.2 version with the default partition number
> (sc.defaultParallelism), the jobs would take forever to complete. So I
> changed it to three times the number of cores, i.e., 32 × 3 = 96. This
> worked like magic and the jobs completed quickly.
>
> - However, when I tried the same magic number on version 2.1, the jobs
> took forever. Default parallelism works better, but is not that
> efficient.
>
> I'm having trouble rationalising this and comparing the two systems. My
> question is: what changed between 1.2 and 2.1 with respect to default
> parallelism that would cause this behaviour? How can I make both versions
> behave similarly on the same software/hardware configuration so that I can
> compare them?
>
> I'd really appreciate your help on this!
>
> Cheers,
> Jeevan
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>

-- 
Apostolos N. Papadopoulos, Associate Professor
Department of Informatics
Aristotle University of Thessaloniki
Thessaloniki, GREECE
tel: ++0030312310991918
email: papadopo@csd.auth.gr
twitter: @papadopoulos_ap
web: http://delab.csd.auth.gr/~apostol



