spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Spark SQL / Parquet - Dynamic Schema detection
Date Mon, 14 Mar 2016 17:59:44 GMT
>
> Each json file is of a single object and has the potential to have
> variance in the schema.
>
How much variance are we talking about? JSON->Parquet is going to do well
with hundreds of different columns, but at tens of thousands many things
will probably start breaking.
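
One rough way to put a number on that variance before converting anything is to count the distinct column paths across the input files. The sketch below is plain Python rather than Spark code, and the helper names `flatten_keys` and `distinct_columns` are made up for illustration. It assumes each document is a single JSON object and treats an array as one column, which is a simplification (schema inference in Spark may also descend into structs nested inside arrays).

```python
import json


def flatten_keys(obj, prefix=""):
    """Yield a dotted column path for every non-object field of a JSON object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            if isinstance(value, dict):
                # Nested object: descend and report its fields instead.
                yield from flatten_keys(value, path)
            else:
                # Scalars and arrays are counted as a single column (assumption).
                yield path


def distinct_columns(json_docs):
    """Return the set of column paths seen across all JSON documents."""
    columns = set()
    for doc in json_docs:
        columns.update(flatten_keys(json.loads(doc)))
    return columns


# Two sample documents with slightly different shapes (made-up data).
docs = [
    '{"id": 1, "user": {"name": "a"}}',
    '{"id": 2, "user": {"email": "b"}, "tags": []}',
]
cols = distinct_columns(docs)
print(len(cols), sorted(cols))
```

If the count stays in the hundreds, the conversion is in the comfortable range described above; if it climbs into the tens of thousands, that is a hint to rethink the layout first.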
