spark-dev mailing list archives

From Tathagata Das <tathagata.das1...@gmail.com>
Subject Re: Structured Streaming Schema Issue
Date Wed, 01 Feb 2017 23:29:07 GMT
You should make sure that the schema of the streaming Dataset returned by
`readStream` and the schema of the DataFrame returned by the source's
`getBatch` are the same.
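
To illustrate the check suggested above, here is a minimal sketch in plain Python. It does not use the Spark API: `Field` is a simplified, hypothetical stand-in for a Spark `StructField`, and `schema_mismatches` is an illustrative helper, not part of any library. The two example schemas only mirror the types in the log output quoted below; which side actually holds which type in the connector is an assumption.

```python
from typing import List, NamedTuple


class Field(NamedTuple):
    """Simplified stand-in for a Spark StructField (name, type, nullable)."""
    name: str
    data_type: str
    nullable: bool = True


def schema_mismatches(declared: List[Field], batch: List[Field]) -> List[str]:
    """Describe every difference between the declared and the batch schema."""
    problems = []
    if len(declared) != len(batch):
        problems.append(f"field count differs: {len(declared)} vs {len(batch)}")
    # zip stops at the shorter schema; the count check above covers the rest.
    for position, (want, got) in enumerate(zip(declared, batch)):
        if want != got:
            problems.append(f"field {position}: declared {want}, batch has {got}")
    return problems


# Hypothetical schemas: the assignment of types to sides is arbitrary here.
declared_schema = [Field("customerid", "StringType")]
batch_schema = [Field("customerid", "LongType")]

for problem in schema_mismatches(declared_schema, batch_schema):
    print(problem)
```

In a real connector the same comparison would be made between the schema the source declares and the schema of the DataFrame it returns from `getBatch`, for instance by logging both at the start of each batch.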

On Wed, Feb 1, 2017 at 3:25 PM, Sam Elamin <hussam.elamin@gmail.com> wrote:

> Hi All
>
> I am writing a BigQuery connector here
> <http://github.com/samelamin/spark-bigquery> and I am getting a strange
> error where the schema is overwritten when a DataFrame is passed over to
> the Sink.
>
>
> For example, the source returns this StructType:
> WARN streaming.BigQuerySource: StructType(StructField(
> customerid,LongType,true),
>
> and the sink is receiving this StructType:
> WARN streaming.BigQuerySink: StructType(StructField(
> customerid,StringType,true)
>
>
> Any idea why this might be happening?
> I don't have schema inference on:
>
> spark.conf.set("spark.sql.streaming.schemaInference", "false")
>
> I know it's off by default, but I set it just to be sure.
>
> So I am completely lost as to what could be causing this.
>
> Regards
> Sam
>
