spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Dynamically InferSchema From Hive and Create parquet file
Date Wed, 05 Nov 2014 18:57:34 GMT
That method creates a new directory to hold Parquet data when no Hive
metastore is available, which is why you have to specify the schema.

If you've already created the table in the metastore you can just query it
using the sql method:

javahiveContext.sql("SELECT * FROM parquetTable");

You can also load the data as a SchemaRDD without using the metastore, since
Parquet is self-describing:

javahiveContext.parquetFile(".../path/to/parquetFiles").registerTempTable("parquetData");
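
To illustrate why no bean class is needed in that case, here is a toy sketch in plain Java of what "self-describing" means: the schema is stored in the file next to the data, so a reader can recover it from the file alone. This is only an analogy, not Parquet's actual format and not Spark's API. The class name SelfDescribingDemo, the "#schema" footer line, and the name:type encoding are all made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy format: data rows followed by one footer line with the schema.
public class SelfDescribingDemo {

    private static final String FOOTER_PREFIX = "#schema ";

    // Write the data rows first, then a footer line holding "name:type" pairs.
    static void write(Path file, Map<String, String> schema, List<String> rows)
            throws IOException {
        List<String> lines = new ArrayList<>(rows);
        List<String> fields = new ArrayList<>();
        for (Map.Entry<String, String> e : schema.entrySet()) {
            fields.add(e.getKey() + ":" + e.getValue());
        }
        lines.add(FOOTER_PREFIX + String.join(",", fields));
        Files.write(file, lines);
    }

    // Recover the schema from the file alone; no bean class or catalog is consulted.
    static Map<String, String> readSchema(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file);
        String footer = lines.get(lines.size() - 1);
        if (!footer.startsWith(FOOTER_PREFIX)) {
            throw new IOException("missing schema footer");
        }
        Map<String, String> schema = new LinkedHashMap<>();
        for (String field : footer.substring(FOOTER_PREFIX.length()).split(",")) {
            String[] parts = field.split(":", 2);
            schema.put(parts[0], parts[1]);
        }
        return schema;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("toy", ".txt");
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("id", "int");
        schema.put("name", "string");
        write(file, schema, Arrays.asList("1,alice", "2,bob"));
        System.out.println(readSchema(file)); // prints {id=int, name=string}
        Files.delete(file);
    }
}
```

The writer is the only place the schema has to be stated. By analogy, that is why createParquetFile asks for a schema while parquetFile does not.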

On Wed, Nov 5, 2014 at 2:15 AM, Jahagirdar, Madhu <
madhu.jahagirdar@philips.com> wrote:

>  Currently the createParquetFile method needs a bean class as one of its
> parameters.
>
>  javahiveContext.createParquetFile(XBean.class, IMPALA_TABLE_LOC, true, new Configuration())
>      .registerTempTable(TEMP_TABLE_NAME);
>
>
>  Is it possible to dynamically infer the schema from Hive, using the Hive
> context and the table name, and then pass in that schema?
>
>
>  Regards.
>
> Madhu Jahagirdar
