spark-user mailing list archives

From Rohit Karlupia <roh...@qubole.com>
Subject Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool
Date Sun, 25 Mar 2018 09:46:44 GMT
Thanks, Shmuel, for trying out Sparklens!

A couple of things that I noticed:
1) 250 executors is probably overkill for this job. It would run in the same
time with around 100.
2) Many of the stages that take a long time have only 200 tasks, whereas we have
750 cores available for the job. 200 is the default value of
spark.sql.shuffle.partitions. You could try increasing the value of
spark.sql.shuffle.partitions to at least 750.
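
To illustrate point 2, here is a minimal Python sketch of the arithmetic. It is not part of Sparklens; the function name stage_core_usage is hypothetical, and it assumes one core per task and tasks of roughly equal duration, which real stages rarely satisfy exactly.

```python
import math


def stage_core_usage(num_tasks: int, total_cores: int) -> dict:
    """Rough estimate of how well one stage can use the job's cores.

    Assumptions (not from Sparklens): each task needs one core and all
    tasks take about the same time.
    """
    # Tasks run in "waves": at most total_cores tasks execute at once.
    waves = math.ceil(num_tasks / total_cores)
    # Fraction of the available core slots that actually receive a task.
    utilization = num_tasks / (waves * total_cores)
    # Core slots left without a task across all waves.
    idle_slots = waves * total_cores - num_tasks
    return {"waves": waves, "utilization": utilization, "idle_slots": idle_slots}


# 200 shuffle partitions (the default) on 750 cores.
default_partitions = stage_core_usage(num_tasks=200, total_cores=750)
# 750 shuffle partitions on the same 750 cores.
raised_partitions = stage_core_usage(num_tasks=750, total_cores=750)

print(default_partitions)
print(raised_partitions)
```

With 200 tasks, only 200 of the 750 cores have work during such a stage, so roughly three quarters of the cores sit idle. The property can be set per job, for example with --conf spark.sql.shuffle.partitions=750 on spark-submit.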

thanks,
rohitk

On Sun, Mar 25, 2018 at 1:25 PM, Shmuel Blitz <shmuel.blitz@similarweb.com>
wrote:

> I ran it on a single job.
> SparkLens has an overhead on the job duration. I'm not ready to enable it
> by default on all our jobs.
>
> Attached is the output.
>
> Still trying to understand what exactly it means.
>
> On Sun, Mar 25, 2018 at 10:40 AM, Fawze Abujaber <fawzeaj@gmail.com>
> wrote:
>
>> Nice!
>>
>> Shmuel, were you able to run it at the cluster level or for a specific job?
>>
>> Did you configure it in spark-defaults.conf?
>>
>> On Sun, 25 Mar 2018 at 10:34 Shmuel Blitz <shmuel.blitz@similarweb.com>
>> wrote:
>>
>>> Just to let you know, I have managed to run SparkLens on our cluster.
>>>
>>> I switched to the spark_1.6 branch, and also compiled against the
>>> specific image of Spark we are using (cdh5.7.6).
>>>
>>> Now I need to figure out what the output means... :P
>>>
>>> Shmuel
>>>
>>> On Fri, Mar 23, 2018 at 7:24 PM, Fawze Abujaber <fawzeaj@gmail.com>
>>> wrote:
>>>
>>>> Quick question:
>>>>
>>>> How do I add --jars /path/to/sparklens_2.11-0.1.0.jar to
>>>> spark-defaults.conf? Should I use
>>>>
>>>> spark.driver.extraClassPath /path/to/sparklens_2.11-0.1.0.jar
>>>>
>>>> or the spark.jars option? Could anyone give an example of how it
>>>> should look, and say whether the path to the jar should be an HDFS
>>>> path, as I'm using it in cluster mode?
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Mar 23, 2018 at 6:33 AM, Fawze Abujaber <fawzeaj@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Shmuel,
>>>>>
>>>>> Did you compile the code against the right branch for Spark 1.6?
>>>>>
>>>>> I tested it and it looks like it is working, and I'm now testing the
>>>>> branch more widely. Please use the branch for Spark 1.6.
>>>>>
>>>>> On Fri, Mar 23, 2018 at 12:43 AM, Shmuel Blitz <
>>>>> shmuel.blitz@similarweb.com> wrote:
>>>>>
>>>>>> Hi Rohit,
>>>>>>
>>>>>> Thanks for sharing this great tool.
>>>>>> I tried running a spark job with the tool, but it failed with an
>>>>>> *IncompatibleClassChangeError* exception.
>>>>>>
>>>>>> I have opened an issue on GitHub:
>>>>>> https://github.com/qubole/sparklens/issues/1
>>>>>>
>>>>>> Shmuel
>>>>>>
>>>>>> On Thu, Mar 22, 2018 at 5:05 PM, Shmuel Blitz <
>>>>>> shmuel.blitz@similarweb.com> wrote:
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> We will give this a try and report back.
>>>>>>>
>>>>>>> Shmuel
>>>>>>>
>>>>>>> On Thu, Mar 22, 2018 at 4:22 PM, Rohit Karlupia <rohitk@qubole.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks everyone!
>>>>>>>> Please share how it works and how it doesn't. Both help.
>>>>>>>>
>>>>>>>> Fawze, I just made a few changes to make this work with Spark 1.6. Can
>>>>>>>> you please try building from the *spark_1.6* branch?
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> rohitk
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 22, 2018 at 10:18 AM, Fawze Abujaber <fawzeaj@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> It's super amazing... I see it was tested on Spark 2.0.0 and
>>>>>>>>> above. What about Spark 1.6, which is still part of Cloudera's
>>>>>>>>> main versions?
>>>>>>>>>
>>>>>>>>> We have a vast number of Spark applications on version 1.6.0.
>>>>>>>>>
>>>>>>>>> On Thu, Mar 22, 2018 at 6:38 AM, Holden Karau <
>>>>>>>>> holden@pigscanfly.ca> wrote:
>>>>>>>>>
>>>>>>>>>> Super exciting! I look forward to digging through it this weekend.
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 21, 2018 at 9:33 PM ☼ R Nair (रविशंकर नायर) <
>>>>>>>>>> ravishankar.nair@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Excellent. You filled a missing link.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Passion
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 21, 2018 at 11:36 PM, Rohit Karlupia <rohitk@qubole.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Happy to announce the availability of Sparklens as an open source
>>>>>>>>>>>> project. It helps in understanding the scalability limits of Spark
>>>>>>>>>>>> applications and can be a useful guide on the path towards tuning
>>>>>>>>>>>> applications for lower runtime or cost.
>>>>>>>>>>>>
>>>>>>>>>>>> Please clone from here: https://github.com/qubole/sparklens
>>>>>>>>>>>> Old blogpost:
>>>>>>>>>>>> https://www.qubole.com/blog/introducing-quboles-spark-tuning-tool/
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>> rohitk
>>>>>>>>>>>>
>>>>>>>>>>>> PS: Thanks for the patience. It took a couple of months to get
>>>>>>>>>>>> back on this.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Shmuel Blitz
>>>>>>> Big Data Developer
>>>>>>> Email: shmuel.blitz@similarweb.com
>>>>>>> www.similarweb.com
>>>>>>> <https://www.facebook.com/SimilarWeb/>
>>>>>>> <https://www.linkedin.com/company/429838/>
>>>>>>> <https://twitter.com/similarweb>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Shmuel Blitz
>>>>>> Big Data Developer
>>>>>> Email: shmuel.blitz@similarweb.com
>>>>>> www.similarweb.com
>>>>>> <https://www.facebook.com/SimilarWeb/>
>>>>>> <https://www.linkedin.com/company/429838/>
>>>>>> <https://twitter.com/similarweb>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Shmuel Blitz
>>> Big Data Developer
>>> Email: shmuel.blitz@similarweb.com
>>> www.similarweb.com
>>> <https://www.facebook.com/SimilarWeb/>
>>> <https://www.linkedin.com/company/429838/>
>>> <https://twitter.com/similarweb>
>>>
>>
>
>
> --
> Shmuel Blitz
> Big Data Developer
> Email: shmuel.blitz@similarweb.com
> www.similarweb.com
> <https://www.facebook.com/SimilarWeb/>
> <https://www.linkedin.com/company/429838/>
> <https://twitter.com/similarweb>
>
