spark-user mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: Should it be safe to embed Spark in Local Mode?
Date Wed, 20 Jul 2016 05:14:46 GMT
That's interesting and might be better suited to the dev list. I know that
in some cases calls to System.exit(-1) were added so that the task would be
marked as a failure.

On Tuesday, July 19, 2016, Brett Randall <javabrett@gmail.com> wrote:

> This question concerns
> https://issues.apache.org/jira/browse/SPARK-15685 (StackOverflowError
> (VirtualMachineError) or NoClassDefFoundError (LinkageError) should not
> System.exit() in local mode) and aims to draw attention to, and prompt
> discussion of, that issue.
>
> I have a product that is hosted as a microservice, running in a
> web container (e.g. Jetty) as a long-running service and publishing a REST
> API.  For small computations, to reduce latency, I wish to run Spark in
> local mode.  For larger jobs the service might launch a remote job on a
> cluster, e.g. Spark-on-YARN.  Either way, custom modules may be deployed
> to the service from time to time, involving third-party libraries etc.
>
> My concern is as outlined in SPARK-15685.  If I have a third-party
> library, and either its direct or transitive dependencies are not
> satisfied, then when the code is deployed and run I might suffer a
> NoClassDefFoundError.  Or there may be some broken logic leading to a
> StackOverflowError (VirtualMachineError).  Normally, if this occurred in a
> plain microservice/web application, the thread handling the request would
> see the unchecked Throwable/Error and fail, but the service would
> otherwise continue.
>
> With Spark in local mode, due to the specific categorization and handling
> of the aforementioned Throwable/Error types (ref Utils.isFatalError and
> other Scala definitions), the result when they are thrown is that Spark
> deems that the JVM should be forcibly shut down via System.exit(), thereby
> killing the microservice.
>
> Is it reasonable, when the above Errors occur, to ask that Spark not exit
> the JVM and instead allow some exception or error to be thrown?  The
> System.exit() approach seems aligned with the idea of a command-line batch
> job and a quick exit of the entire JVM and any running threads, but it is
> poorly suited to running in local mode in a microservice.
>
> Thoughts?
>
> Thanks,
> Brett
>
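
For illustration, the plain-service behaviour described in the quoted
message (a request thread fails with an Error while the JVM keeps running)
might look like the minimal Java sketch below. It does not use Spark and
says nothing about Spark's own handling; the class and method names
(ErrorIsolationSketch, handleRequest, recurseForever) are hypothetical, and
the unbounded recursion merely stands in for broken third-party logic.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: an Error on a worker thread fails one request only.
class ErrorIsolationSketch {

    // Stand-in for broken logic: recursion with no base case.
    static int recurseForever(int depth) {
        return recurseForever(depth + 1) + 1;
    }

    // Runs one "request" on the pool and reports how it ended.
    static String handleRequest(ExecutorService pool, Callable<Integer> task) {
        Future<Integer> result = pool.submit(task);
        try {
            return "ok: " + result.get();
        } catch (ExecutionException e) {
            // Whatever the task threw (including an Error) is the cause.
            return "failed: " + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // First request overflows the worker's stack and fails.
        System.out.println(handleRequest(pool, () -> recurseForever(0)));
        // The JVM and the pool are still usable for the next request.
        System.out.println(handleRequest(pool, () -> 1 + 1));
        pool.shutdown();
    }
}
```

In this sketch the first request reports a StackOverflowError and the second
still completes, which is the "thread fails, service continues" outcome the
quoted message contrasts with a JVM-wide System.exit().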


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
