spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andy Dang <nam...@gmail.com>
Subject Re: Best Practice for Spark Job Jar Generation
Date Fri, 23 Dec 2016 15:04:40 GMT
I used to use uber jar in Spark 1.x because of classpath issues (we
couldn't re-model our dependencies based on our code, and thus cluster's
run dependencies could be very different from running Spark directly in the
IDE. We had to use userClasspathFirst "hack" to work around this.

With Spark 2, it's easier to replace dependencies (say, Guava) than before.
We moved away from deploying superjar and just pass the libraries as part
of Spark jars (still can't use Guava v19 or later because Spark uses a
deprecated method that's not available, but that's not a big issue for us).

-------
Regards,
Andy

On Fri, Dec 23, 2016 at 6:44 AM, Chetan Khatri <chetan.opensource@gmail.com>
wrote:

> Hello Spark Community,
>
> For Spark Job Creation I use SBT Assembly to build Uber("Super") Jar and
> then submit to spark-submit.
>
> Example,
>
> bin/spark-submit --class hbase.spark.chetan.com.SparkHbaseJob
> /home/chetan/hbase-spark/SparkMSAPoc-assembly-1.0.jar
>
> But other folks has debate with for Uber Less Jar, Guys can you please
> explain me best practice industry standard for the same.
>
> Thanks,
>
> Chetan Khatri.
>

Mime
View raw message