spark-user mailing list archives

From kpeng1 <>
Subject Creating a hive table on top of a parquet file written out by spark
Date Mon, 16 Mar 2015 17:55:22 GMT
Hi All,

I wrote out a complex Parquet file from Spark SQL and am now trying to put
a Hive table on top of it.  I am running into issues when creating the Hive
table itself.  Here is the JSON that I wrote out to Parquet using Spark SQL:

I created a HiveContext, read in the JSON file using jsonFile, and then
wrote it back out using saveAsParquetFile.
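
Roughly, those steps can be sketched as follows. This is a minimal sketch,
assuming the Scala API of Spark 1.x; the application name and the input path
hdfs:///tmp/test.json are placeholders rather than the actual values used,
and only the output path matches the table location given further down.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object JsonToParquet {
  def main(args: Array[String]): Unit = {
    // Application name is a placeholder.
    val sc = new SparkContext(new SparkConf().setAppName("json-to-parquet"))
    val hiveContext = new HiveContext(sc)

    // Hypothetical input path; the schema is inferred from the JSON records.
    val records = hiveContext.jsonFile("hdfs:///tmp/test.json")

    // Print the inferred schema so it can be compared with the Hive DDL.
    records.printSchema()

    // Write the data back out as Parquet at the location the Hive table uses.
    records.saveAsParquetFile("hdfs:///tmp/test.parquet")

    sc.stop()
  }
}
```

The printSchema() call is only there to show which top-level fields end up in
the Parquet file, since those are the names Hive has to match.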

Afterwards I tried to create a Hive table on top of the Parquet file.
Here is the HiveQL that I have:
create table test (mycol STRUCT<user_id:String,
providers:ARRAY<STRUCT<id:String, name:String,
behaviors:MAP<String, String>>>>) stored as parquet;
alter table test set location 'hdfs:///tmp/test.parquet';

I get errors when I try to do a select * on the table:
Failed with exception
Column mycol at index 0 does not exist in {providers=providers,
