spark-user mailing list archives

From Daniel Santana <dan...@everymundo.com>
Subject Re: Spark Streaming Kinesis Performance Decrease When Cluster Scale Up with More Executors
Date Thu, 14 Jul 2016 19:59:55 GMT
Are you re-sharding your Kinesis stream as well?

I had a similar problem, and increasing the number of shards in the Kinesis
stream solved it.
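
For context, here is a back-of-the-envelope sketch of why the shard count can
matter. It assumes the documented per-shard Kinesis limits (1 MB/s or 1,000
records/s for writes, 2 MB/s for reads) and that the Kinesis Client Library
behind each receiver leases whole shards, so receivers beyond the shard count
have nothing to read. The function names and the sample figures are
hypothetical; they are not taken from the application discussed in this thread.

```python
import math

# Documented per-shard Kinesis limits (assumed unchanged for your account).
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000
SHARD_READ_MB_PER_SEC = 2.0


def shards_needed(ingest_mb_per_sec, ingest_records_per_sec, read_mb_per_sec):
    """Smallest shard count that satisfies the write and read limits."""
    return max(
        1,
        math.ceil(ingest_mb_per_sec / SHARD_WRITE_MB_PER_SEC),
        math.ceil(ingest_records_per_sec / SHARD_WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb_per_sec / SHARD_READ_MB_PER_SEC),
    )


def idle_receivers(num_receivers, num_shards):
    """Receivers left without a shard lease (assumes one lease per shard)."""
    return max(0, num_receivers - num_shards)


# Hypothetical workload: 120 MB/s and 90,000 records/s in, 120 MB/s read back.
shards = shards_needed(120.0, 90000, 120.0)
print(shards, idle_receivers(250, shards))
```

With these made-up figures the stream needs 120 shards, and 130 of 250
receivers would sit idle unless the stream is re-sharded to match.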

-- 
*Daniel Santana*
Senior Software Engineer

EVERY*MUNDO*
25 SE 2nd Ave., Suite 900
Miami, FL 33131 USA
main:+1 (305) 375-0045
EveryMundo.com <http://www.everymundo.com/#whoweare>

*Confidentiality Notice: *This email and any files transmitted with it are
confidential and intended solely for the use of the individual or entity to
whom they are addressed. If you have received this email in error, please
notify the system manager.

On Thu, Jul 14, 2016 at 2:20 PM, Renxia Wang <renxia.wang@gmail.com> wrote:

> Additional information: the batch duration in my app is 1 minute. In the
> Spark UI, each batch shows a large gap between Output Op Duration and Job
> Duration, e.g. an Output Op Duration of 1min against a Job Duration of 19s.
>
> 2016-07-14 10:49 GMT-07:00 Renxia Wang <renxia.wang@gmail.com>:
>
>> Hi all,
>>
>> I am running a Spark Streaming application with Kinesis on EMR 4.7.1. The
>> application runs on YARN in client mode. There are 17 worker nodes
>> (c3.8xlarge) with 100 executors and 100 receivers. This setup works fine.
>>
>> But when I increase the number of worker nodes to 50 and the number of
>> executors to 250, with 250 receivers, the batch processing time increases
>> from ~50s to 2.3min, and the scheduler delay for tasks increases from
>> ~0.2s max to 20s max (while the 75th percentile is about 2-3s).
>>
>> I also tried increasing only the number of executors while keeping the
>> number of receivers unchanged, but performance still degraded: the batch
>> processing time went from ~50s to 1.1min, and the scheduler delay for tasks
>> increased from ~0.2s max to 4s max (while the 75th percentile is about 1s).
>>
>> The spark-submit command is as follows. The only parameter I changed here
>> is num-executors.
>>
>> spark-submit \
>>   --deploy-mode client \
>>   --verbose \
>>   --master yarn \
>>   --jars /usr/lib/spark/extras/lib/spark-streaming-kinesis-asl.jar \
>>   --driver-memory 20g --driver-cores 20 \
>>   --num-executors 250 \
>>   --executor-cores 5 \
>>   --executor-memory 8g \
>>   --conf spark.yarn.executor.memoryOverhead=1600 \
>>   --conf spark.driver.maxResultSize=0 \
>>   --conf spark.dynamicAllocation.enabled=false \
>>   --conf spark.rdd.compress=true \
>>   --conf spark.streaming.stopGracefullyOnShutdown=true \
>>   --conf spark.streaming.backpressure.enabled=true \
>>   --conf spark.speculation=true \
>>   --conf spark.task.maxFailures=15 \
>>   --conf spark.ui.retainedJobs=100 \
>>   --conf spark.ui.retainedStages=100 \
>>   --conf spark.executor.logs.rolling.maxRetainedFiles=1 \
>>   --conf spark.executor.logs.rolling.strategy=time \
>>   --conf spark.executor.logs.rolling.time.interval=hourly \
>>   --conf spark.scheduler.mode=FAIR \
>>   --conf spark.scheduler.allocation.file=/home/hadoop/fairscheduler.xml \
>>   --conf spark.metrics.conf=/home/hadoop/spark-metrics.properties \
>>   --class Main /home/hadoop/Main-1.0.jar
>>
>> I found this issue, which seems relevant:
>> https://issues.apache.org/jira/browse/SPARK-14327
>>
>> Any suggestions on how to troubleshoot this?
>>
>> Thanks,
>>
>> Renxia
>>
>>
>
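
A rough sketch of the arithmetic behind the three setups quoted above, offered
as a starting point rather than a diagnosis. It assumes
spark.streaming.blockInterval is left at its 200 ms default (the spark-submit
command does not set it), that each received block becomes one task in the
first stage of a batch, and that every receiver permanently occupies one
executor core. The helper names are hypothetical.

```python
def receiver_tasks_per_batch(num_receivers, batch_seconds, block_interval_ms=200):
    """Blocks (and so first-stage tasks) produced per batch by all receivers."""
    blocks_per_receiver = (batch_seconds * 1000) // block_interval_ms
    return num_receivers * blocks_per_receiver


def processing_cores(num_executors, executor_cores, num_receivers):
    """Cores left for batch jobs, assuming each receiver pins one core."""
    return num_executors * executor_cores - num_receivers


# (executors, receivers) for the three setups described in the thread,
# with the 1-minute batch duration and 5 cores per executor.
for executors, receivers in [(100, 100), (250, 100), (250, 250)]:
    tasks = receiver_tasks_per_batch(receivers, batch_seconds=60)
    cores = processing_cores(executors, executor_cores=5, num_receivers=receivers)
    print(executors, receivers, tasks, cores)
```

Under these assumptions, going from 100 to 250 receivers raises the
first-stage task count per batch from 30,000 to 75,000, which a single driver
has to schedule; that would be consistent with the higher scheduler delay. It
does not account for the slowdown seen with 250 executors and 100 receivers,
where the task count stays at 30,000, so it is at most one contributing factor.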
