spark-user mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: Spark's Guava pieces cause exceptions in non-trivial deployments
Date Fri, 15 May 2015 21:42:09 GMT
On Fri, May 15, 2015 at 2:35 PM, Thomas Dudziak <tomdzk@gmail.com> wrote:

> I've just been through this exact case with shaded guava in our Mesos
> setup and that is how it behaves there (with Spark 1.3.1).
>

If that's the case, it's a bug in the Mesos backend, since the spark.*
options should behave exactly the same as SPARK_CLASSPATH. It would be nice
to know whether that is also the case in 1.4 (I took a quick look at the
related code and it seems correct), but I don't have Mesos around to test.
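[Editor's note: for readers following along, a minimal sketch of the two mechanisms under discussion; the jar path, application class, and jar name are placeholders.]

```shell
# Deprecated approach: SPARK_CLASSPATH prepends entries to the classpath
# for both the driver and the executors.
export SPARK_CLASSPATH=/opt/libs/guava-16.0.1.jar

# Replacement approach: the spark.* options, which are meant to behave
# the same way. The jar must exist at this path on every node.
spark-submit \
  --conf spark.driver.extraClassPath=/opt/libs/guava-16.0.1.jar \
  --conf spark.executor.extraClassPath=/opt/libs/guava-16.0.1.jar \
  --class com.example.Main app.jar
```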




>
> On Fri, May 15, 2015 at 12:04 PM, Marcelo Vanzin <vanzin@cloudera.com>
> wrote:
>
>> On Fri, May 15, 2015 at 11:56 AM, Thomas Dudziak <tomdzk@gmail.com>
>> wrote:
>>
>>> Actually the extraClassPath settings put the extra jars at the end of
>>> the classpath, so they won't help. Only the deprecated SPARK_CLASSPATH puts
>>> them at the front.
>>>
>>
>> That's definitely not the case for YARN:
>>
>> https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1013
>>
>> And it's been like that for as long as I can remember.
>>
>> I'm almost sure that's also the case for standalone, at least in master /
>> 1.4, since I touched a lot of that code recently.
>>
>> It would be really weird if those options worked differently from
>> SPARK_CLASSPATH, since they were meant to replace it.
>>
>>
>> On Fri, May 15, 2015 at 11:54 AM, Marcelo Vanzin <vanzin@cloudera.com>
>>> wrote:
>>>
>>>> Ah, I see. Yeah, it sucks that Spark has to expose Optional (and things
>>>> it depends on), but removing that would break the public API, so...
>>>>
>>>> One last thing you could try is to add your newer Guava jar to
>>>> "spark.driver.extraClassPath" and "spark.executor.extraClassPath". Those
>>>> settings will place your jars before Spark's in the classpath, so you'd
>>>> actually be using the newer versions of the conflicting classes everywhere.
>>>>
>>>> It does require manually distributing the Guava jar to the same
>>>> location on all nodes in the cluster, though.
>>>>
>>>> If that doesn't work, Thomas's suggestion of shading Guava in your app
>>>> can be used as a last resort.
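>>>> [Editor's note: a minimal sketch of that shading approach using the
>>>> sbt-assembly plugin; the plugin setup and the relocated package name
>>>> (myapp.shaded.guava) are assumptions, not part of the thread.]
>>>>
>>>> ```scala
>>>> // build.sbt -- requires the sbt-assembly plugin (0.14.0+)
>>>> // Relocates all com.google.common classes bundled into the app jar
>>>> // so they can no longer clash with Spark's copies.
>>>> assemblyShadeRules in assembly := Seq(
>>>>   ShadeRule.rename("com.google.common.**" -> "myapp.shaded.guava.@1").inAll
>>>> )
>>>> ```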
>>>>
>>>>
>>>> On Thu, May 14, 2015 at 7:38 PM, Anton Brazhnyk <
>>>> anton.brazhnyk@genesys.com> wrote:
>>>>
>>>>> The problem is with 1.3.1.
>>>>>
>>>>> It has the Function class (mentioned in the exception) in
>>>>> spark-network-common_2.10-1.3.1.jar.
>>>>>
>>>>> Our current resolution is actually a backport to 1.2.2, which is working
>>>>> fine.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* Marcelo Vanzin [mailto:vanzin@cloudera.com]
>>>>> *Sent:* Thursday, May 14, 2015 6:27 PM
>>>>> *To:* Anton Brazhnyk
>>>>> *Cc:* user@spark.apache.org
>>>>> *Subject:* Re: Spark's Guava pieces cause exceptions in non-trivial
>>>>> deployments
>>>>>
>>>>>
>>>>>
>>>>> What version of Spark are you using?
>>>>>
>>>>> The bug you mention is only about the Optional class (and a handful of
>>>>> others, but none of the classes you're having problems with). All other
>>>>> Guava classes should be shaded since Spark 1.2, so you should be able to
>>>>> use your own version of Guava with no problems (aside from the Optional
>>>>> classes).
>>>>>
>>>>> Also, Spark 1.3 added some improvements to how shading is done, so if
>>>>> you're using 1.2 I'd recommend trying 1.3 before declaring defeat.
>>>>>
>>>>>
>>>>>
>>>>> On Thu, May 14, 2015 at 4:52 PM, Anton Brazhnyk <
>>>>> anton.brazhnyk@genesys.com> wrote:
>>>>>
>>>>>  Greetings,
>>>>>
>>>>>
>>>>>
>>>>> I have a relatively complex application in which Spark, Jetty and Guava
>>>>> (16) do not fit together.
>>>>>
>>>>> The exception happens when some components use a “mix” of Guava
>>>>> classes (including Spark’s pieces) that are loaded by different
>>>>> classloaders:
>>>>>
>>>>> java.lang.LinkageError: loader constraint violation: when resolving
>>>>> method
>>>>> "com.google.common.collect.Iterables.transform(Ljava/lang/Iterable;Lcom/google/common/base/Function;)Ljava/lang/Iterable;"
>>>>> the class loader (instance of org/eclipse/jetty/webapp/WebAppClassLoader)
>>>>> of the current class, org/apache/cassandra/db/ColumnFamilyStore, and the
>>>>> class loader (instance of java/net/URLClassLoader) for resolved class,
>>>>> com/google/common/collect/Iterables, have different Class objects for the
>>>>> type com/google/common/base/Function used in the
>>>>> signature
>>>>>
>>>>>
>>>>>
>>>>> According to https://issues.apache.org/jira/browse/SPARK-4819 it’s
>>>>> not going to be fixed at least until Spark 2.0, but maybe some workaround
>>>>> is possible?
>>>>>
>>>>> Those classes are pretty simple and are unlikely to change significantly
>>>>> in Guava, so any “external” Guava can provide them.
>>>>>
>>>>>
>>>>>
>>>>> So, could such problems be fixed if those Spark’s pieces of Guava
>>>>> were in a separate jar that could be excluded from the mix (substituted by
>>>>> an “external” Guava)?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Anton
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Marcelo
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Marcelo
>>>>
>>>
>>>
>>
>>
>> --
>> Marcelo
>>
>
>


-- 
Marcelo
