spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Cannot instantiate hive context
Date Tue, 04 Nov 2014 07:55:30 GMT
Not quite sure, but moving the Guava 11 jar to the first position in the
classpath may solve this issue.
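
One way to check which jar actually supplies Guava at runtime is to ask the
JVM where it loaded the class from, and whether the method named in the
NoSuchMethodError exists on it. Below is a minimal sketch using only JDK
reflection. The class name ClassOrigin and its helpers locate and hasMethod
are made up for illustration and are not part of Spark or Guava:

```java
import java.security.CodeSource;

public class ClassOrigin {

    // Reports where the JVM loaded the named class from, without initializing it.
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader (e.g. java.lang.String) have no code source.
            if (src == null || src.getLocation() == null) {
                return "bootstrap";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    // Checks whether the class visible on the classpath has the given public method.
    static boolean hasMethod(String className, String method, Class<?>... params) {
        try {
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            c.getMethod(method, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method taken from the NoSuchMethodError in the quoted trace.
        String hf = "com.google.common.hash.HashFunction";
        System.out.println(hf + " loaded from: " + locate(hf));
        System.out.println("hashInt(int) present: " + hasMethod(hf, "hashInt", int.class));
    }
}
```

Running this with the same classpath as the failing spark-shell should show
which Guava jar wins. If hashInt(int) is reported missing, the older jar is
probably being picked up first.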

Thanks
Best Regards

On Tue, Nov 4, 2014 at 1:47 AM, Pala M Muthaia <mchettiar@rocketfuelinc.com>
wrote:

> Thanks Akhil.
>
> I realized that earlier, and I thought mvn -Phive should have captured and
> included all these dependencies.
>
> In any case, I proceeded with that, included the other dependencies that
> were missing, and finally hit the Guava version mismatch issue (Spark
> with Guava 14 vs. Hadoop/Hive with Guava 11). There are 2 parts:
>
> 1. Spark bundles the Guava library within its jars, which may conflict with
> Hadoop/Hive components that depend on an older version of the library.
>
> It seems this has been solved by the SPARK-2848
> <https://issues.apache.org/jira/browse/SPARK-2848> patch, which shades the
> Guava libraries.
>
>
> 2. Spark actually uses interfaces from a newer version of the Guava library,
> so that code needs to be rewritten to use the older version (i.e. downgrade
> Spark's dependency on Guava).
>
> I wasn't able to find the related patches (I need them since I am on Spark
> 1.0.1). After applying the patch for #1 above, I still hit the following error:
>
> 14/11/03 15:01:32 WARN storage.BlockManager: Putting block broadcast_0 failed
> java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
>         at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
>         at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
>         at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
> .... <stack continues>
>
> I haven't been able to find the other patches that actually downgrade the
> dependency.
>
>
> Please point me to those patches, or share any other ideas for fixing these
> dependency issues.
>
>
> Thanks.
>
>
>
> On Sun, Nov 2, 2014 at 8:41 AM, Akhil Das <akhil@sigmoidanalytics.com>
> wrote:
>
>> Adding the libthrift jar
>> <http://mvnrepository.com/artifact/org.apache.thrift/libthrift/0.9.0> to
>> the classpath should resolve this issue.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, Nov 1, 2014 at 12:34 AM, Pala M Muthaia <
>> mchettiar@rocketfuelinc.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to load hive datasets using HiveContext, in spark shell.
>>> Spark ver 1.0.1 and Hive ver 0.12.
>>>
>>> We are trying to get Spark to work with Hive datasets. I already have an
>>> existing Spark deployment. Here is what I did on top of that:
>>> 1. Build spark using 'mvn -Pyarn,hive -Phadoop-2.4
>>> -Dhadoop.version=2.4.0 -DskipTests clean package'
>>> 2. Copy over spark-assembly-1.0.1-hadoop2.4.0.jar into spark deployment
>>> directory.
>>> 3. Launch spark-shell with the spark hive jar included in the list.
>>>
>>> When I execute
>>>
>>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>>
>>> I get the following error stack:
>>>
>>> java.lang.NoClassDefFoundError: org/apache/thrift/TBase
>>>         at java.lang.ClassLoader.defineClass1(Native Method)
>>>         at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
>>>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>         ....
>>>         at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>> Caused by: java.lang.ClassNotFoundException: org.apache.thrift.TBase
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>         ... 55 more
>>>
>>> I thought that building with the -Phive option would include all the
>>> necessary Hive packages in the assembly jar (according to the docs here
>>> <https://spark.apache.org/docs/1.0.1/sql-programming-guide.html#hive-tables>).
>>> I tried searching online and in this mailing list archive but haven't found
>>> any instructions on how to get this working.
>>>
>>> I know that there is an additional step of updating the assembly jar across
>>> the whole cluster, not just on the client side, but right now even the
>>> client is not working.
>>>
>>> I would appreciate instructions (or a link to them) on how to get this
>>> working end-to-end.
>>>
>>>
>>> Thanks,
>>> pala
>>>
>>
>>
>
