spark-issues mailing list archives

From "Ankit Raj Boudh (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-29437) CSV Writer should escape 'escapechar' when it exists in the data
Date Tue, 15 Oct 2019 03:17:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-29437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951525#comment-16951525 ]

Ankit Raj Boudh commented on SPARK-29437:
-----------------------------------------

Hi [~kretes],

As per my analysis, this is the correct behaviour.

Following your suggestion, I tried escaping the escape character (default '\'), but that
causes a problem when the same data is stored inside a view.

> CSV Writer should escape 'escapechar' when it exists in the data
> ----------------------------------------------------------------
>
>                 Key: SPARK-29437
>                 URL: https://issues.apache.org/jira/browse/SPARK-29437
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.3
>            Reporter: Tomasz Bartczak
>            Priority: Trivial
>
> When the data contains the escape character (default '\'), it should be either escaped
> or quoted.
> Steps to reproduce: [https://gist.github.com/kretes/58f7f66a0780681a44c175a2ac3c0da2]
>  
> The effect can be either bad data being read or, sometimes, an inability to read the CSV
> properly. For example, when the escape character is the last character in a column, it
> breaks column reading for that row and effectively breaks, e.g., type inference for a
> dataframe.
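
The failure mode described in the report can be sketched with a toy CSV writer and reader
in plain Python. This is not Spark's CSV code: the always-quote policy and the names
write_field and read_line are assumptions made for illustration only. The sketch shows how
a field ending in the escape character swallows the closing quote unless the escape
character itself is escaped.

```python
ESCAPE = "\\"
QUOTE = '"'


def write_field(value, escape_the_escape):
    """Quote a field and escape embedded quotes.

    When escape_the_escape is True, the escape character is escaped as well.
    """
    out = []
    for ch in value:
        if ch == QUOTE or (escape_the_escape and ch == ESCAPE):
            out.append(ESCAPE)
        out.append(ch)
    return QUOTE + "".join(out) + QUOTE


def read_line(line):
    """Parse one line of quoted, comma-separated fields, honoring ESCAPE."""
    fields, buf, in_quotes, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if in_quotes and ch == ESCAPE and i + 1 < len(line):
            # Take the next character literally, whatever it is.
            buf.append(line[i + 1])
            i += 2
            continue
        if ch == QUOTE:
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    fields.append("".join(buf))
    return fields


row = ["ends with \\", "second"]

# Writer that leaves the escape character alone: the trailing backslash
# escapes the closing quote, so the two columns merge on read.
naive_line = ",".join(write_field(v, False) for v in row)

# Writer that escapes the escape character: the row round-trips.
fixed_line = ",".join(write_field(v, True) for v in row)

print(read_line(naive_line))
print(read_line(fixed_line))
```

In this toy model the naive line parses into a single merged field, while the line written
with the escape character escaped parses back into the original two columns.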



