spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: [spark sql performance] Only 1 executor to write output?
Date Mon, 25 Mar 2019 10:06:45 GMT
Hi,

I think articles and demonstrations from 2016 are really old and likely
out of date by now. What of course does not change is the fact that we all
have to understand our data better.

That said, try to go through this. I am not quite sure whether this
feature is available in Apache Spark or not, but it certainly gives some
idea:
https://docs.databricks.com/delta/join-performance/skew-join.html
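
If that hint is not available in your Spark build, manual key salting is
the generic technique it automates. A rough Scala sketch (largeDf, smallDf
and the join column "key" are placeholders, not from the job in question):

import org.apache.spark.sql.functions._

// Pick a salt range; more salts spread a hot key across more tasks.
val numSalts = 16

// Skewed (large) side: tag each row with a random salt in [0, numSalts).
val saltedLarge = largeDf.withColumn("salt", (rand() * numSalts).cast("int"))

// Small side: replicate every row once per salt value so the join matches.
val saltedSmall = smallDf.withColumn(
  "salt", explode(array((0 until numSalts).map(lit): _*)))

// Joining on (key, salt) splits each hot key across up to numSalts tasks
// instead of one straggler; drop the helper column afterwards.
val joined = saltedLarge.join(saltedSmall, Seq("key", "salt")).drop("salt")

The trade-off is that the small side is duplicated numSalts times, so this
only pays off when one side is much smaller than the other.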

Regards,
Gourav

On Mon, Mar 25, 2019 at 4:30 AM Akhil Das <akhld@hacked.work> wrote:

> Looks like data skew; try to find a better partitioning key for the
> dataset. This might help:
>
> https://databricks.com/session/handling-data-skew-adaptively-in-spark-using-dynamic-repartitioning
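>
> Something along these lines can confirm the skew and spread the data more
> evenly (df, join_key and event_date are placeholders, not from your job):
>
> import org.apache.spark.sql.functions._
>
> // Confirm the skew first: a few keys dwarfing all others is the tell.
> df.groupBy("join_key").count().orderBy(desc("count")).show(20)
>
> // Then repartition on a composite of columns whose combined cardinality
> // is high enough to spread rows across all shuffle partitions.
> val balanced = df.repartition(200, col("join_key"), col("event_date"))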
>
> On Sun, 24 Mar 2019, 22:49 Mike Chan, <mikechancs@gmail.com> wrote:
>
>> Dear all,
>>
>> I have a Spark SQL job that used to execute in under 10 minutes but now
>> runs for 3 hours after a cluster migration, and I need to deep dive into
>> what it's actually doing. I'm new to Spark, so please bear with me if I'm
>> asking something basic.
>>
>> Env: Azure HDInsight Spark 2.4 on Azure Storage
>> SQL: Read and join some data, then write the result to a Hive metastore
>> table
>>
>> Application Behavior:
>> Within the first 15 minutes, it loads and completes most tasks (199/200);
>> after that, only 1 executor process stays alive, continually doing shuffle
>> reads and writes. Because only 1 executor is left, we have to wait 3 hours
>> for the application to finish.
>> [screenshot]
>>
>> Only 1 executor left alive: [screenshot]
>>
>> Not sure what the executor is doing: [screenshot]
>>
>> From time to time, we can see the shuffle read increase: [screenshot]
>>
>> Therefore I increased spark.executor.memory to 20g, but nothing changed.
>> From Ambari and YARN I can tell the cluster still has plenty of resources
>> left. [screenshot]
>>
>> Almost all executors released: [screenshot]
>>
>> The Spark SQL job ends with the code below:
>> .write.mode("overwrite").saveAsTable("default.mikemiketable")
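>>
>> (One variant I am thinking of trying - the partition count and the
>> leading df reference are guesses on my part - is to force a repartition
>> before the write so the final stage runs with more than 1 task:)
>>
>> df.repartition(100)
>>   .write.mode("overwrite")
>>   .saveAsTable("default.mikemiketable")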
>>
>> As you can tell, I'm new to Spark and don't fully understand what's going
>> on here. The huge shuffle spill looks ugly, but it is probably not the
>> reason for the slow execution - the real question is why only 1 executor
>> is doing all the work. I would greatly appreciate any pointers on how to
>> troubleshoot / look further into this. Thank you very much.
>>
>> Best Regards,
>> Mike
>>
>
