spark-user mailing list archives

From JF Chen <darou...@gmail.com>
Subject How to increase the parallelism of a Spark Streaming application?
Date Wed, 07 Nov 2018 07:27:48 GMT
I have a Spark Streaming application which reads data from Kafka and saves
the transformation result to HDFS.
The Kafka topic originally has 8 partitions, and I repartition the data to
100 to increase the parallelism of the Spark job.
Now I am wondering: if I increase the Kafka partition count to 100 instead
of calling repartition(100), will performance improve? (I know the
repartition operation costs a lot of CPU, since it shuffles all the data.)
And if I do set the Kafka partition count to 100, are there any negative
effects?
I only have one production environment, so it's not convenient for me to
test this....
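
For reference, a minimal sketch of the two alternatives being compared, using the Structured Streaming Kafka source (the broker address, topic names, and HDFS paths below are placeholders, not taken from the original mail):

```scala
import org.apache.spark.sql.SparkSession

object KafkaParallelismSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-hdfs")
      .getOrCreate()

    // Option A: read the 8-partition topic, then repartition to 100.
    // The Kafka source creates one input partition per topic partition,
    // so the shuffle inside repartition() is what buys the extra
    // parallelism downstream -- at the cost of moving all the data.
    val viaRepartition = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder
      .option("subscribe", "my-topic")                  // placeholder
      .load()
      .repartition(100)

    // Option B: read a topic that itself has 100 partitions.
    // The source then yields 100 tasks directly and no shuffle is needed,
    // but consumer overhead per partition on the Kafka side goes up.
    val viaTopicPartitions = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "my-topic-100p") // hypothetical 100-partition topic
      .load()

    viaTopicPartitions.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/out")          // placeholder output path
      .option("checkpointLocation", "hdfs:///chk") // required by file sinks
      .start()
      .awaitTermination()
  }
}
```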

Thanks!

Regards,
Junfeng Chen
