spark-user mailing list archives

From Aliaksandr Bedrytski <meano...@gmail.com>
Subject Re: How does .jsonFile() work?
Date Thu, 28 Apr 2016 16:38:25 GMT
If your question is about how the schema is inferred for JSON,
Section 5.1 of this paper
https://amplab.cs.berkeley.edu/wp-content/uploads/2015/03/SparkSQLSigmod2015.pdf

explains it quite well. Long story short, Spark tries to find
the most specific type for each field; if a field shows conflicting
types, it falls back to string.
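
To make the "most specific type, otherwise string" rule concrete, here is a
minimal sketch in plain Python. It is not Spark's implementation: the type
names and the helpers infer_type, merge_types and infer_schema are
hypothetical, nested objects and arrays are ignored, and the only widening
modelled is long to double.

```python
import json

# Illustration only: a simplified take on "most specific type, else string".
# All function names and type labels here are assumptions, not Spark's API.

def infer_type(value):
    """Return the most specific type label for a single JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    # Strings, and (in this sketch) nested objects and arrays as well.
    return "string"

def merge_types(a, b):
    """Combine two observed types into the most specific type covering both."""
    if a == b:
        return a
    if a == "null":
        return b
    if b == "null":
        return a
    if {a, b} == {"long", "double"}:
        return "double"  # integers widen to a fractional type
    return "string"      # conflicting types fall back to string

def infer_schema(lines):
    """Infer a flat field -> type mapping from one JSON record per line."""
    schema = {}
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            observed = infer_type(value)
            if field in schema:
                schema[field] = merge_types(schema[field], observed)
            else:
                schema[field] = observed
    return schema

records = [
    '{"id": 1, "score": 3, "tag": "a"}',
    '{"id": 2, "score": 4.5, "tag": 7}',
    '{"id": 3, "score": null}',
]
schema = infer_schema(records)
print(schema)
```

In this toy input, "score" is seen as an integer and as a fraction, so it
widens to double, while "tag" is seen as both a string and a number, so it
ends up as string.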

On Thu, Apr 28, 2016 at 5:53 PM harjitdotsingh <singh.harjit@gmail.com>
wrote:

> From what I know and what I have played with, jsonFile reads JSON records
> that are defined as one record per line. It's not always the case that you
> can supply the data that way. If you have custom JSON data where you
> cannot define one record per line, you will have to write your own
> custom receiver to receive the data and then parse it. I hope that makes
> sense.
> I wrote my own handler to read a directory that contained JSON files.
> It reads each file until it hits EOF and then calls the store method,
> which sends the data to your driver.
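
The one-record-per-line layout described in the quoted message can be
illustrated without Spark. The sketch below uses only Python's standard json
module; parse_json_lines is a hypothetical helper, and it only shows why a
line-by-line parser cannot read a pretty-printed, multi-line JSON document.
It says nothing about how Spark itself reports such input.

```python
import json

def parse_json_lines(text):
    """Parse text that holds exactly one JSON record per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        records.append(json.loads(line))
    return records

# One complete record per line: each line parses on its own.
line_delimited = '{"name": "a", "n": 1}\n{"name": "b", "n": 2}\n'
rows = parse_json_lines(line_delimited)

# The same kind of record pretty-printed across several lines: the first
# line is just "{", which is not a complete JSON value.
pretty_printed = '{\n  "name": "a",\n  "n": 1\n}\n'
try:
    parse_json_lines(pretty_printed)
    multi_line_ok = True
except json.JSONDecodeError:
    multi_line_ok = False
```

When the data cannot be supplied one record per line, one option is to read
each file whole and parse it as a single document before handing the records
on, which is roughly what the custom handler in the quoted message does.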
