commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Bruun-Hansen (Jira)" <j...@apache.org>
Subject [jira] [Created] (CSV-254) POSTGRESQL_CSV cannot parse correctly (null vs zero-length string)
Date Wed, 30 Oct 2019 19:26:00 GMT
Lars Bruun-Hansen created CSV-254:
-------------------------------------

             Summary: POSTGRESQL_CSV cannot parse correctly (null vs zero-length string)
                 Key: CSV-254
                 URL: https://issues.apache.org/jira/browse/CSV-254
             Project: Commons CSV
          Issue Type: Bug
            Reporter: Lars Bruun-Hansen


 
 PostgresSQL by default creates CSV files which uses absent value for database NULL values.
Consider the following test in PostgresSQL:
{code:sql}
CREATE TABLE mytab (
	col1   varchar(10),
	col2   int,
	col3   varchar(10)
);
INSERT INTO mytab VALUES ('AA', 33, null);
INSERT INTO mytab VALUES ('AA', null, '');
INSERT INTO mytab VALUES (null, 33, 'CC');
{code}
and then exporting it to CSV:
{code:sql}
\copy mytab TO STDOUT WITH csv
{code}
will produce CSV output as follows:
{noformat}
AA,33,
AA,,""
,33,CC
{noformat}
which the library currently will parse as follows using the current {{POSTGRESQL_CSV}} format:
{noformat}
"AA","33",null
"AA",null,null
null,"33","CC"
{noformat}
Row no 2 is incorrect when comparing to the actual database table contents.
h2. The fix

Therefore the declaration of {{POSTGRESQL_CSV}} must be changed to:
{code:java}
public static final CSVFormat POSTGRESQL_CSV = DEFAULT
            .withDelimiter(COMMA)
            .withEscape(DOUBLE_QUOTE_CHAR)
            .withIgnoreEmptyLines(false)
            .withQuote(DOUBLE_QUOTE_CHAR)
            .withRecordSeparator(LF)
            .withAbsentMeansNull(true)   // added
            .withNullString(EMPTY)
            .withQuoteMode(QuoteMode.ALL_NON_NULL);
{code}
(this depends on [PR51|https://github.com/apache/commons-csv/pull/51])

 With the above the parser will give the following result instead: 
{noformat}
"AA","33",null
"AA",null,""
null,"33","CC"
{noformat}
which is the expected result.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message