spark-user mailing list archives

From Tobias Pfeiffer <...@preferred.jp>
Subject Re: [SQL] Conflicts in inferred Json Schemas
Date Mon, 26 Jan 2015 01:41:10 GMT
Hi,

On Thu, Jan 22, 2015 at 2:26 AM, Corey Nolet <cjnolet@gmail.com> wrote:

> Let's say I have 2 formats for json objects in the same file
> schema1 = { "location": "12345 My Lane" }
> schema2 = { "location":{"houseAddres":"1234 My Lane"} }
>
> From my tests, it looks like the current inferSchema() function will end
> up with only StructField("location", StringType).
>

In Spark SQL, columns need to have a well-defined type (as in SQL in
general). So "inferring the schema" requires that there is a single
consistent "schema" to infer, and I am afraid there is no easy way to
achieve what you want in Spark SQL, as there is no data type covering both
values you see (a string and a struct). (I am pretty sure it can be done if
you dive deep into the internals, add data types etc., though.)
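
To illustrate why one column cannot keep both shapes, here is a toy sketch in
plain Python. It is not Spark's actual inference code: the helper names
(infer_type, merge_types, infer_schema) and the string type labels are made up
for this example, and the "fall back to string on conflict" rule is only an
assumption modelled on the behaviour you reported from your tests.

```python
import json


def infer_type(value):
    """Return a simplified type descriptor for a parsed JSON value."""
    if isinstance(value, dict):
        return {key: infer_type(val) for key, val in value.items()}
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return "string"  # arrays and anything else are collapsed for brevity


def merge_types(left, right):
    """Merge two descriptors into the single type a column must have."""
    if left == right:
        return left
    if left == "null":
        return right
    if right == "null":
        return left
    if isinstance(left, dict) and isinstance(right, dict):
        merged = dict(left)
        for key, typ in right.items():
            merged[key] = merge_types(merged[key], typ) if key in merged else typ
        return merged
    # Assumption: incompatible types (e.g. string vs. struct) collapse to string.
    return "string"


def infer_schema(lines):
    """Fold the per-record types of all JSON lines into one schema."""
    schema = {}
    for line in lines:
        schema = merge_types(schema, infer_type(json.loads(line)))
    return schema


records = [
    '{"location": "12345 My Lane"}',
    '{"location": {"houseAddres": "1234 My Lane"}}',
]
schema = infer_schema(records)
print(schema)  # {'location': 'string'}
```

Whatever rule is picked for the conflict, the point is the same: the merge step
has to return exactly one type per column, so the struct shape of "location"
cannot survive next to the plain string.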

Tobias
