spark-dev mailing list archives

From Chetan Khatri <chetan.opensou...@gmail.com>
Subject Re: Best Practice for Spark Job Jar Generation
Date Fri, 23 Dec 2016 18:23:21 GMT
Correct, so there are two options: the approach you suggested and the uber
jar approach. I think the uber jar approach is the better practice, because
it makes migrating between environments easier. I would also expect the uber
jar approach to perform better than the non-uber approach.

Thanks.

On Fri, Dec 23, 2016 at 11:41 PM, Andy Dang <namd88@gmail.com> wrote:

> We re-model Spark's dependencies and ours together and chuck them under the
> /jars path. There are other ways to do it, but we want the classpath to be
> as close to development as possible.
>
> -------
> Regards,
> Andy
>
> On Fri, Dec 23, 2016 at 6:00 PM, Chetan Khatri <
> chetan.opensource@gmail.com> wrote:
>
>> Andy, thanks for the reply.
>>
>> If we download all the dependencies to a separate location and link them
>> with the Spark job jar on the cluster, is that the best way to run a job?
>>
>> Thanks.
>>
>> On Fri, Dec 23, 2016 at 8:34 PM, Andy Dang <namd88@gmail.com> wrote:
>>
>>> I used to use an uber jar in Spark 1.x because of classpath issues (we
>>> couldn't re-model our dependencies based on our code, and thus the
>>> cluster's runtime dependencies could be very different from those when
>>> running Spark directly in the IDE). We had to use the userClasspathFirst
>>> "hack" to work around this.
>>>
>>> With Spark 2, it's easier to replace dependencies (say, Guava) than
>>> before. We moved away from deploying a superjar and now just pass the libraries
>>> as part of Spark jars (still can't use Guava v19 or later because Spark
>>> uses a deprecated method that's not available, but that's not a big issue
>>> for us).
>>>
>>> -------
>>> Regards,
>>> Andy
>>>
>>> On Fri, Dec 23, 2016 at 6:44 AM, Chetan Khatri <
>>> chetan.opensource@gmail.com> wrote:
>>>
>>>> Hello Spark Community,
>>>>
>>>> To build Spark jobs I use sbt-assembly to create an uber ("super") jar
>>>> and then submit it with spark-submit.
>>>>
>>>> For example:
>>>>
>>>> bin/spark-submit --class hbase.spark.chetan.com.SparkHbaseJob
>>>> /home/chetan/hbase-spark/SparkMSAPoc-assembly-1.0.jar
>>>>
>>>> But other folks argue for the non-uber jar approach. Could you please
>>>> explain what the industry-standard best practice is here?
>>>>
>>>> Thanks,
>>>>
>>>> Chetan Khatri.
>>>>
>>>
>>>
>>
>
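
For readers who want to see the uber-jar setup from the original question in
concrete form, here is a minimal sbt sketch. It assumes the sbt-assembly
plugin is enabled in project/plugins.sbt. The project name and version are
inferred from the jar name in the thread; the Scala and Spark versions and
the choice of Spark modules are assumptions, not details from the thread.
Spark itself is marked "provided" so that it is not bundled into the jar,
since the cluster already supplies it.

```scala
// build.sbt: illustrative sketch only; the versions below are assumptions.
// Assumes project/plugins.sbt contains something like:
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

name := "SparkMSAPoc"
version := "1.0"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // "provided": needed to compile, but left out of the assembled jar
  "org.apache.spark" %% "spark-core" % "2.0.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.0.2" % "provided"
  // Application-specific libraries (for example an HBase client) would go
  // here without "provided" so that they are bundled into the uber jar.
)
```

With these settings, running "sbt assembly" should by default produce a jar
named SparkMSAPoc-assembly-1.0.jar under target/scala-2.11/, which can then
be passed to spark-submit as in the original message.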
