spark-issues mailing list archives

From "Yin Huai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-6123) Parquet reader should use the schema of every file to create converter
Date Tue, 03 Mar 2015 01:41:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344278#comment-14344278 ]

Yin Huai commented on SPARK-6123:
---------------------------------

To work around this issue, users need to load the existing data (the files written with
containsNull=false) and write that data to a new directory.

> Parquet reader should use the schema of every file to create converter
> ----------------------------------------------------------------------
>
>                 Key: SPARK-6123
>                 URL: https://issues.apache.org/jira/browse/SPARK-6123
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Yin Huai
>
> When two Parquet files for the same table contain an array column, and the array values
> in one file were written with containsNull=true while those in the other file were written
> with containsNull=false, containsNull in the merged schema will be true and we cannot
> correctly read the data that was written with containsNull=false.
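
The mismatch described above can be illustrated with a small standalone model. This is
only a sketch in plain Python, not Spark code: ArrayType and merge_array_types are
hypothetical names, and the merge rule (containsNull is true if either side is true) is
an assumption taken from the description's statement that the merged schema ends up with
containsNull=true.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayType:
    """Toy stand-in for an array column type (hypothetical, not Spark's class)."""
    element_type: str
    contains_null: bool


def merge_array_types(left: ArrayType, right: ArrayType) -> ArrayType:
    """Merge two array types; assumed rule: nullable if either side is nullable."""
    if left.element_type != right.element_type:
        raise ValueError("incompatible element types")
    return ArrayType(left.element_type, left.contains_null or right.contains_null)


# Per-file schemas of the same array column.
file_a = ArrayType("int", contains_null=True)
file_b = ArrayType("int", contains_null=False)

merged = merge_array_types(file_a, file_b)

# The merged schema no longer matches the schema file_b was written with,
# so a converter built only from the merged schema is the wrong one for file_b.
print(merged.contains_null)
print(merged == file_b)
```

In this model the merged type equals the schema of the first file but not the second,
which is why the issue title asks for the converter to be built from each file's own schema.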



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


