spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: [SPARK on MESOS] Avoid re-fetching Spark binary
Date Tue, 10 Jul 2018 15:49:36 GMT
It's been done many times before by many organizations. Use Spark Job
Server or Livy or create your own implementation of a similar long-running
Spark Application. Creating a new Application for every Job is not the way
to achieve low-latency performance.
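
A minimal sketch of the "long-running Spark Application" pattern described above, not the implementation used by Spark Job Server or Livy. It assumes PySpark is available when the default factory runs. The names JobService, make_spark_session, and session_factory are hypothetical and exist only for illustration. The session is created once, and every submitted job reuses it from a worker thread, which Spark supports within a single application.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def make_spark_session():
    """Build a SparkSession (assumes PySpark is installed; imported lazily)."""
    from pyspark.sql import SparkSession
    return SparkSession.builder.appName("long-running-job-service").getOrCreate()


class JobService:
    """Hypothetical long-running service: one shared session, many jobs."""

    def __init__(self, session_factory=make_spark_session, max_workers=4):
        self._factory = session_factory
        self._session = None
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _get_session(self):
        # Pay the application startup cost once; later jobs reuse the session.
        with self._lock:
            if self._session is None:
                self._session = self._factory()
            return self._session

    def submit(self, job):
        """Run job(session) on a worker thread and return a Future."""
        return self._pool.submit(lambda: job(self._get_session()))

    def shutdown(self):
        # Wait for running jobs, then stop the shared session if it has stop().
        self._pool.shutdown(wait=True)
        with self._lock:
            if self._session is not None and hasattr(self._session, "stop"):
                self._session.stop()
            self._session = None


# Example use (requires PySpark):
#   service = JobService()
#   future = service.submit(lambda spark: spark.range(1000).count())
#   print(future.result())
#   service.shutdown()
```

In a setup like the one in this thread, the REST handler would call submit() on a service instance that lives for the lifetime of the process, so only the first request pays the SparkContext initialization cost.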

On Tue, Jul 10, 2018 at 4:18 AM <tphan.dat@gmail.com> wrote:

> Dear,
>
> Our jobs are triggered by users on demand, and each new job is submitted
> to the Spark server via a REST API. The 2-4 seconds of latency comes mainly
> from initializing a SparkContext every time a new job is submitted, as you
> mentioned.
>
> If you are aware of a way to avoid this initialization, could you please
> share it? That would be perfect for our case.
>
> Best
> Tien Dat
>
> <quote author='Mark Hamstra'>
> Essentially correct. The latency to start a Spark Job is nowhere close to
> 2-4 seconds under typical conditions. Creating a new Spark Application
> every time instead of running multiple Jobs in one Application is not going
> to lead to acceptable interactive or real-time performance, nor is that an
> execution model that Spark is ever likely to support in trying to meet
> low-latency requirements. As such, reducing Application startup time (not
> Job startup time) is not a priority.
>
> On Fri, Jul 6, 2018 at 4:06 PM Timothy Chen <tnachen@gmail.com> wrote:
>
> > I know there are some community efforts shown in Spark summits before,
> > mostly around reusing the same Spark context with multiple “jobs”.
> >
> > I don’t think reducing Spark job startup time is a community priority
> > afaik.
> >
> > Tim
> > On Fri, Jul 6, 2018 at 7:12 PM Tien Dat <tphan.dat@gmail.com> wrote:
> >
> >> Dear Timothy,
> >>
> >> It works like a charm now.
> >>
> >> BTW (don't judge me if I am too greedy :-)), the latency to start a Spark
> >> job is around 2-4 seconds, unless I am not aware of some awesome
> >> optimization in Spark. Do you know if the Spark community is working on
> >> reducing this latency?
> >>
> >> Best
>
> </quote>
> Quoted from:
>
> http://apache-spark-user-list.1001560.n3.nabble.com/SPARK-on-MESOS-Avoid-re-fetching-Spark-binary-tp32849p32865.html