spark-user mailing list archives

From Anton Bubna-Litic <>
Subject Creating external tables in Spark 2.0.0
Date Thu, 03 Nov 2016 04:39:51 GMT

When I create an external table in Hive based on a Parquet file in Spark 2.0.0, I am running
into an error that causes queries against the table to return all nulls. I believe it is because
Spark SQL is using its own Parquet support instead of the Hive SerDe and there is potentially
a schema mismatch (however, I have checked many times and cannot find one). If I set
spark.sql.hive.convertMetastoreParquet to false, I am able to query the table and get the
results. However, this has the side effect of not being able to create Hive tables based on
Parquet files. It appears to be related to this thread ( ). Note that it does not appear to
happen to every external table that I create.
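To make the symptom concrete, here is a minimal sketch in Spark SQL; the table name and location are hypothetical, shown only to illustrate the behaviour described above:

```sql
-- Hypothetical table and path, for illustration only.
CREATE EXTERNAL TABLE my_table (id BIGINT, name STRING)
STORED AS PARQUET
LOCATION '/data/my_table';

-- Under the default (Spark's native Parquet reader), every column comes back NULL:
SELECT * FROM my_table;

-- Workaround: fall back to the Hive Parquet SerDe instead of Spark's reader.
SET spark.sql.hive.convertMetastoreParquet=false;

-- The same query now returns the expected rows.
SELECT * FROM my_table;
```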

Is this a bug or is it intentional?

Here is my workflow:
1. Create an external table.
2. Querying the external table returns nulls, but querying the Parquet file directly is fine.
   Note that both have the same number of rows.
3. If I set convertMetastoreParquet to false, I can query the external table.
4. However, if I then try to create a table using CTAS, it fails with an
   'alter_table_with_cascade' error.
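A sketch of the CTAS step that fails once convertMetastoreParquet is false; again, the table names are hypothetical and only stand in for the statement shape described above:

```sql
-- With spark.sql.hive.convertMetastoreParquet=false in effect, a CTAS
-- over a Parquet source fails (hypothetical names):
CREATE TABLE my_table_copy
STORED AS PARQUET
AS SELECT * FROM my_table;
-- fails with an 'alter_table_with_cascade' error
```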

Anton Bubna-Litic
Level 25, 8-12 Chifley Square
Sydney NSW 2000

T: +61 2 8222 3585


