spark-issues mailing list archives

From "angerszhu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-27442) ParquetFileFormat fails to read column named with invalid characters
Date Thu, 25 Jul 2019 01:55:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-27442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892348#comment-16892348
] 

angerszhu commented on SPARK-27442:
-----------------------------------

I hit the same problem: I can't write a Parquet file with a column named, for example, 'max(t)'.

`Caused by: org.apache.spark.sql.AnalysisException: Attribute name "max(t_when)" contains
invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.;`
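
For the write path, the error message itself suggests renaming the column. A minimal sketch of that idea in plain Python, assuming the rejected character set is exactly the one quoted in the message above; `INVALID_CHARS`, `has_invalid_chars`, and `sanitize_column_name` are hypothetical helper names, not part of Spark:

```python
# Characters listed in the AnalysisException message: " ,;{}()\n\t="
# (assumed here to be the complete set Spark rejects for Parquet column names)
INVALID_CHARS = set(" ,;{}()\n\t=")


def has_invalid_chars(name: str) -> bool:
    """Return True if the column name contains any rejected character."""
    return any(ch in INVALID_CHARS for ch in name)


def sanitize_column_name(name: str, replacement: str = "_") -> str:
    """Replace every rejected character with `replacement`.

    Names without rejected characters are returned unchanged.
    """
    return "".join(replacement if ch in INVALID_CHARS else ch for ch in name)


if __name__ == "__main__":
    for col in ["max(t_when)", "t_when", "a b"]:
        print(repr(col), has_invalid_chars(col), repr(sanitize_column_name(col)))
```

In PySpark the cleaned names could then be applied before writing, for example with `df.toDF(*[sanitize_column_name(c) for c in df.columns])`. This only helps when writing; as the issue description below notes, renaming does not help on read because the check runs on the input schema.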

> ParquetFileFormat fails to read column named with invalid characters
> --------------------------------------------------------------------
>
>                 Key: SPARK-27442
>                 URL: https://issues.apache.org/jira/browse/SPARK-27442
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.0.0, 2.4.1
>            Reporter: Jan Vršovský
>            Priority: Minor
>
> When reading a Parquet file whose column names contain characters considered invalid,
> the reader fails with an exception:
> Name: org.apache.spark.sql.AnalysisException
> Message: Attribute name "..." contains invalid character(s) among " ,;{}()\n\t=". Please
> use alias to rename it.
> Spark should not be able to write such files, but it should be able to read them (and
> allow the user to correct them). However, the possible workarounds (such as using an alias
> to rename the column, or forcing another schema) do not work, since the check is done on
> the input.
> (Possible fix: remove the superfluous {{ParquetWriteSupport.setSchema(requiredSchema, hadoopConf)}}
> call from {{buildReaderWithPartitionValues}}?)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


