spark-dev mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe
Date Wed, 03 Dec 2014 19:05:18 GMT
Thanks for reporting. As a workaround you should be able to SET
spark.sql.hive.convertMetastoreParquet=false, but I'm going to try to fix
this before the next RC.
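
A minimal sketch of applying this workaround through Beeline, for illustration only. The JDBC URL (localhost:10000) and the table name my_parquet_table are placeholders that do not come from this thread. The SET and the query go into one script so the setting is in effect when the query runs.

```shell
# Hypothetical default; point this at your own Thrift server.
THRIFT_URL="${THRIFT_URL:-jdbc:hive2://localhost:10000}"

# Write the workaround and a sample query into a single script.
# my_parquet_table is a placeholder table name.
cat > workaround.sql <<'EOF'
SET spark.sql.hive.convertMetastoreParquet=false;
SELECT COUNT(*) FROM my_parquet_table;
EOF

# Run the script only if beeline is available on the PATH.
if command -v beeline >/dev/null 2>&1; then
  beeline -u "$THRIFT_URL" -f workaround.sql || echo "query failed; check the server log" >&2
fi
```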

On Wed, Dec 3, 2014 at 7:09 AM, Yana Kadiyska <yana.kadiyska@gmail.com>
wrote:

> Thanks Michael, you are correct.
>
> I also opened https://issues.apache.org/jira/browse/SPARK-4702 -- if
> someone can comment on why this might be happening that would be great.
> This would be a blocker to me using 1.2, and it used to work, so I'm a bit
> puzzled. I was hoping that it was again a result of the default profile
> switch, but that doesn't seem to be the case.
>
> (ps. please advise if this is more user-list appropriate. I'm posting to
> dev as it's an RC)
>
> On Tue, Dec 2, 2014 at 8:37 PM, Michael Armbrust <michael@databricks.com>
> wrote:
>
>> In Hive 13 (which is the default for Spark 1.2), parquet is included and
>> thus we no longer include the Hive parquet bundle. You can now use the
>> included
>> ParquetSerDe: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
>>
>> If you want to compile Spark 1.2 with Hive 12 instead, you can pass
>> -Phive-0.12.0 and parquet.hive.serde.ParquetHiveSerDe will be included as
>> before.
>>
>> Michael
>>
>> On Tue, Dec 2, 2014 at 9:31 AM, Yana Kadiyska <yana.kadiyska@gmail.com>
>> wrote:
>>
>>> Apologies if people get this more than once -- I sent mail to dev@spark
>>> last night and don't see it in the archives. Trying the incubator list
>>> now...wanted to make sure it doesn't get lost in case it's a bug...
>>>
>>> ---------- Forwarded message ----------
>>> From: Yana Kadiyska <yana.kadiyska@gmail.com>
>>> Date: Mon, Dec 1, 2014 at 8:10 PM
>>> Subject: [Thrift,1.2 RC] what happened to
>>> parquet.hive.serde.ParquetHiveSerDe
>>> To: dev@spark.apache.org
>>>
>>>
>>> Hi all, apologies if this is not a question for the dev list -- figured
>>> User list might not be appropriate since I'm having trouble with the RC
>>> tag.
>>>
>>> I just tried deploying the RC and running ThriftServer. I see the
>>> following
>>> error:
>>>
>>> 14/12/01 21:31:42 ERROR UserGroupInformation: PriviledgedActionException
>>> as:anonymous (auth:SIMPLE)
>>> cause:org.apache.hive.service.cli.HiveSQLException:
>>> java.lang.RuntimeException:
>>> MetaException(message:java.lang.ClassNotFoundException Class
>>> parquet.hive.serde.ParquetHiveSerDe not found)
>>> 14/12/01 21:31:42 WARN ThriftCLIService: Error executing statement:
>>> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException:
>>> MetaException(message:java.lang.ClassNotFoundException Class
>>> parquet.hive.serde.ParquetHiveSerDe not found)
>>> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:192)
>>> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
>>> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:212)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
>>> at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
>>> at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>
>>>
>>> I looked at a working installation that I have (built from master a few
>>> weeks ago), and this class used to be included in spark-assembly:
>>>
>>> ls *.jar|xargs grep parquet.hive.serde.ParquetHiveSerDe
>>> Binary file spark-assembly-1.2.0-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.0.jar
>>> matches
>>>
>>> but with the RC build it's not there?
>>>
>>> I tried both the prebuilt CDH drop and later manually built the tag with
>>> the following command:
>>>
>>> ./make-distribution.sh --tgz -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -Phive-thriftserver
>>> $JAVA_HOME/bin/jar -tvf spark-assembly-1.2.0-hadoop2.0.0-mr1-cdh4.2.0.jar | grep parquet.hive.serde.ParquetHiveSerDe
>>>
>>> comes back empty...
>>>
>>
>>
>
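
The jar check from the original report can be sketched as a small helper, under the assumption that the listing comes from `jar tf` (one entry path per line). The function name class_in_listing is hypothetical. It converts the dotted class name to the slash-separated entry path and matches it exactly. The grep in the report works because an unescaped dot in a pattern matches any character, including the slash.

```shell
# Succeeds if the jar listing on stdin contains the .class entry
# for the dotted class name given as the first argument.
class_in_listing() {
  # Turn parquet.hive.serde.Foo into parquet/hive/serde/Foo.class
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  # -x matches the whole line, -F treats the entry as a fixed string
  grep -qxF "$entry"
}
```

For example, `jar tf spark-assembly-1.2.0-hadoop2.0.0-mr1-cdh4.2.0.jar | class_in_listing parquet.hive.serde.ParquetHiveSerDe` returns a zero exit status only when the assembly contains that class.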
