spark-issues mailing list archives

From "Matthew Whelan (JIRA)" <>
Subject [jira] [Commented] (SPARK-5358) spark.files.userClassPathFirst doesn't work correctly
Date Thu, 22 Jan 2015 19:04:38 GMT


Matthew Whelan commented on SPARK-5358:

I'll mark dupes, and I'm working on a PR.  

The fundamental issue is that you can't change the delegation scheme without overriding loadClass
(rather than findClass).  And, if you override loadClass, you kind of have to do it in Java,
because you need a static initializer call to register yourself as parallel-capable.  
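
For illustration only, here is a minimal sketch of what overriding loadClass could look like.  This is not the PR: the class name ChildFirstClassLoader and the field name fallback are made up for the example.  It assumes the user jars should be consulted first and the application's normal classloader only on a miss.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader; class and field names are illustrative only. */
public class ChildFirstClassLoader extends URLClassLoader {
    static {
        // Caller-sensitive: registers the class whose static initializer makes the call,
        // which is why this has to live in the loader class itself.
        ClassLoader.registerAsParallelCapable();
    }

    // Consulted only when neither the bootstrap loader nor the user URLs have the class.
    private final ClassLoader fallback;

    public ChildFirstClassLoader(URL[] userUrls, ClassLoader fallback) {
        // A null parent means super.loadClass sees only bootstrap classes plus userUrls.
        super(userUrls, null);
        this.fallback = fallback;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        try {
            // Goes through findLoadedClass first, so repeated loads hit the cache.
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException notInUserJars) {
            Class<?> c = fallback.loadClass(name);
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

Because the lookup goes through super.loadClass rather than calling findClass directly, a class defined from the user jars is returned from the cache on later requests instead of being defined again.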

Also, the idea of a mutable classloader is pretty terrifying.  You should never do that. 
It lets you load a class C at t1, then change the classloader (in a way that affects the loading
of C) and load C again, at t2.  Which class def gets loaded?  Depends.  This is very bad.

From a cursory inspection, I can't tell if the lifecycle of Executor instances (to which
the classloaders are scoped) spans multiple SparkContexts.  Since each SparkContext fixes
the set of Jars for its jobs, that's a good scope boundary for the classloaders.  

> spark.files.userClassPathFirst doesn't work correctly
> -----------------------------------------------------
>                 Key: SPARK-5358
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Matthew Whelan
> org.apache.spark.executor.ChildExecutorURLClassLoader#findClass delegates to two different
> classloaders: a parent-less URL classloader, and the parent classloader, as a fallback.  The
> delegation is broken such that calling loadClass twice in succession with the same parameters
> will fail the second time.  
> The delegation to the userClassLoader calls findClass directly, which bypasses the classloader's
> cache.  So userClassLoader will attempt to define the same class multiple times, throwing
> LinkageErrors after the first time.
> The canonical way to change the default delegation scheme is to override loadClass, rather
> than just findClass.  It also might be sufficient to have userClassLoader.findClass call super.loadClass.
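
The failure mode in the quoted description can be reproduced in miniature without Spark.  The sketch below is an assumption-laden toy, not Spark's code: DuplicateDefinitionDemo, ByteLoader and the class name demo.Payload are hypothetical, and the class bytes are generated by hand so the example needs no jar.  Calling findClass twice defines the same class twice and raises a LinkageError, while calling loadClass twice returns the cached class.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DuplicateDefinitionDemo {

    /** Emits the class-file bytes of an empty public class with the given binary name. */
    static byte[] emptyClassBytes(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                                // magic
            out.writeShort(0);                                       // minor version
            out.writeShort(52);                                      // major version (Java 8)
            out.writeShort(5);                                       // constant pool count (entries 1..4)
            out.writeByte(7); out.writeShort(2);                     // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));  // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                     // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");      // #4 Utf8: superclass
            out.writeShort(0x0021);                                  // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                       // this_class
            out.writeShort(3);                                       // super_class
            out.writeShort(0);                                       // interfaces count
            out.writeShort(0);                                       // fields count
            out.writeShort(0);                                       // methods count
            out.writeShort(0);                                       // attributes count
            return buf.toByteArray();
        } catch (IOException e) {
            throw new AssertionError(e); // in-memory streams do not fail
        }
    }

    /** Toy loader: every findClass call tries to define the class from scratch. */
    static class ByteLoader extends ClassLoader {
        ByteLoader() {
            super(null); // parent-less, like the user classloader in the report
        }

        @Override
        public Class<?> findClass(String name) {
            byte[] bytes = emptyClassBytes(name);
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Calls findClass directly twice; true if the second call throws LinkageError. */
    static boolean secondFindClassFails() {
        ByteLoader loader = new ByteLoader();
        loader.findClass("demo.Payload");
        try {
            loader.findClass("demo.Payload"); // bypasses findLoadedClass
            return false;
        } catch (LinkageError duplicateDefinition) {
            return true;
        }
    }

    /** Calls loadClass twice; true if both calls return the same Class object. */
    static boolean loadClassIsCached() throws ClassNotFoundException {
        ByteLoader loader = new ByteLoader();
        return loader.loadClass("demo.Payload") == loader.loadClass("demo.Payload");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("second findClass fails: " + secondFindClassFails());
        System.out.println("loadClass is cached: " + loadClassIsCached());
    }
}
```

The only difference between the two paths is that loadClass checks findLoadedClass before falling through to findClass, which is the cache the report says is being bypassed.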
