spark-user mailing list archives

From Sadhan Sood <>
Subject SparkSQL - can we add new column(s) to parquet files
Date Fri, 21 Nov 2014 18:03:53 GMT
We create the table definition by reading the schema from a Parquet file and
store it in the Hive metastore. But if someone adds a new column to the
schema, and we rescan the schema from the new Parquet files and update the
table definition, would queries on the table still work?

So, old table has -> Int a, Int b
new table -> Int a, Int b, String c

but the older Parquet files don't have String c, so when querying the table,
would it return null for column c from the older files and data from the
newer files, or would the query fail?
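To make the question concrete, here is a minimal sketch in plain Python (not Spark; `read_with_merged_schema` is a hypothetical helper, not a Spark API) simulating the null-fill behavior I'm hoping a merged-schema read would show, where rows from older files that lack the new column come back as null:

```python
def read_with_merged_schema(files, merged_columns):
    """Union rows from files with differing schemas, filling columns
    missing from a given file with None (simulating null-fill)."""
    rows = []
    for file_rows in files:
        for row in file_rows:
            # dict.get returns None when the column is absent in this file
            rows.append({col: row.get(col) for col in merged_columns})
    return rows

# Older file has only columns a, b; the newer file adds column c.
old_file = [{"a": 1, "b": 2}]
new_file = [{"a": 3, "b": 4, "c": "x"}]

result = read_with_merged_schema([old_file, new_file], ["a", "b", "c"])
# Hoped-for behavior: result[0]["c"] is None, result[1]["c"] == "x"
```

The question is whether Spark SQL behaves like this sketch when the Hive table definition is updated, or whether the mismatch between the table schema and the older files causes the query to fail.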
