spark-dev mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: Parquet schema migrations
Date Fri, 24 Oct 2014 21:40:12 GMT
Hi Michael,

Does this affect people who use Hive for their metadata store as well?  I'm
wondering if the issue is as bad as I think it is: namely, that once you have
built up a year's worth of data, adding a field forces you to migrate that
entire year's data.

Gary

On Wed, Oct 8, 2014 at 5:08 PM, Cody Koeninger <cody@koeninger.org> wrote:

> On Wed, Oct 8, 2014 at 3:19 PM, Michael Armbrust <michael@databricks.com>
> wrote:
>
> >
> > I was proposing you manually convert each different format into one
> > unified format (by adding literal nulls and such for missing columns) and
> > then union these converted datasets.  It would be weird to have union all
> > try and do this automatically.
> >
>
>
> Sure, I was just musing on what an API for doing the merging without manual
> user input should look like / do.  I'll comment on the ticket, thanks for
> making it.
>
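
For readers following the thread, the manual approach Michael describes above (pad each dataset out to one unified format with literal nulls, then union the results) can be sketched roughly as follows. This is a plain-Python illustration of the idea only, not Spark or Parquet API code; the helper names `pad_to_schema` and `union_all`, the sample rows, and the `country` field are all hypothetical.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def pad_to_schema(rows: Iterable[Row], columns: List[str]) -> List[Row]:
    """Project each row onto `columns`, using None for any column it lacks.

    Hypothetical helper: columns not listed in `columns` are dropped.
    """
    return [{col: row.get(col) for col in columns} for row in rows]


def union_all(*datasets: List[Row]) -> List[Row]:
    """Concatenate datasets that already share one schema (no de-duplication)."""
    merged: List[Row] = []
    for dataset in datasets:
        merged.extend(dataset)
    return merged


# Hypothetical data: rows written before the "country" field was added.
old_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
# Hypothetical data: rows written after the field was added.
new_rows = [{"id": 3, "name": "c", "country": "US"}]

# The unified format is chosen by hand, which is the "manual" step.
unified_columns = ["id", "name", "country"]

merged = union_all(
    pad_to_schema(old_rows, unified_columns),
    pad_to_schema(new_rows, unified_columns),
)
```

In this sketch the old rows are padded when they are read, not rewritten in place. Whether that avoids the full migration Gary asks about depends on how the real storage and metadata layers behave, which the sketch does not model.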
