commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Work logged] (CSV-226) Add CSVParser test case for standard charsets
Date Fri, 01 Mar 2019 01:19:00 GMT


ASF GitHub Bot logged work on CSV-226:

                Author: ASF GitHub Bot
            Created on: 01/Mar/19 01:18
            Start Date: 01/Mar/19 01:18
    Worklog Time Spent: 10m 
      Work Description: aeschwabe commented on issue #30: [CSV-226] Add CSVParser test case
for standard charsets.
   I think I'll close this PR and split some of the files out into two separate issues. I'll
ping JIRA and resubmit when ready.
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

Issue Time Tracking

    Worklog Id:     (was: 206171)
    Time Spent: 20m  (was: 10m)

> Add CSVParser test case for standard charsets
> ---------------------------------------------
>                 Key: CSV-226
>                 URL:
>             Project: Commons CSV
>          Issue Type: Test
>          Components: Parser
>    Affects Versions: 1.5
>            Reporter: Anson Schwabecher
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
> Hello, I'd like to contribute a CSVParser test suite for the standard charsets defined
> in java.nio.charset.StandardCharsets, plus UTF-32.
> This is a standalone test, but it also supports a fix for CSV-107. It also refactors
> and unifies the testing around your established workaround of inserting BOMInputStream ahead
> of the CSVParser.
> It takes a single base UTF-8 encoded file (cstest.csv) and copies it to multiple output
> files (in the target dir) with differing character sets, similar to the iconv tool. Each file
> is then fed into the parser to test all the BOM/NOBOM Unicode variants. I think a file-based
> approach is still important here, rather than just encoding a character stream inline
> as a string: if issues develop, it's easy to inspect the data.
> I noticed in the project’s pom.xml (rat config) that you are excluding individual test
> resource files by name rather than using a wildcard expression to exclude every file in the
> directory. Is there a reason for this? It’s much better if devs do not have to maintain
> this configuration.
> {code:language=xml|title=i.e.: switch over to a single exclude expression}
> <exclude>src/test/resources/**/*</exclude>
> {code}
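
The file-copying step described in the issue can be sketched with the JDK alone. This is a minimal illustration, not the code from the PR: the class name CharsetVariantsSketch, the output file naming scheme, and the inline sample text (standing in for cstest.csv) are all hypothetical. It covers only the Unicode charsets with an explicit byte order, and it assumes Charset.forName can resolve UTF-32BE and UTF-32LE on the running JDK, since they are not part of StandardCharsets. The decode helper is a plain stand-in for the BOMInputStream plus CSVParser step, which needs Commons IO and Commons CSV.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: writes BOM and no-BOM variants of one text in several Unicode charsets.
class CharsetVariantsSketch {

    // Charsets with an explicit byte order, so the encoder never emits a BOM on its own.
    // Assumption: UTF-32BE/LE are available on this JDK (they are not in StandardCharsets).
    static List<Charset> unicodeCharsets() {
        List<Charset> charsets = new ArrayList<>();
        charsets.add(StandardCharsets.UTF_8);
        charsets.add(StandardCharsets.UTF_16BE);
        charsets.add(StandardCharsets.UTF_16LE);
        charsets.add(Charset.forName("UTF-32BE"));
        charsets.add(Charset.forName("UTF-32LE"));
        return charsets;
    }

    // Encodes the text, optionally prefixed with U+FEFF, which encodes to that charset's BOM bytes.
    static byte[] encode(String text, Charset charset, boolean withBom) {
        return (withBom ? "\uFEFF" + text : text).getBytes(charset);
    }

    // Decodes the bytes and drops a leading U+FEFF if the decoder left one in place.
    static String decode(byte[] bytes, Charset charset) {
        String text = new String(bytes, charset);
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    // Writes one file per charset and BOM setting into outDir; the file names are illustrative.
    static List<Path> writeVariants(String text, Path outDir) throws IOException {
        List<Path> written = new ArrayList<>();
        for (Charset cs : unicodeCharsets()) {
            for (boolean withBom : new boolean[] {true, false}) {
                String name = "cstest-" + cs.name() + (withBom ? "-bom" : "-nobom") + ".csv";
                Path file = outDir.resolve(name);
                Files.write(file, encode(text, cs, withBom));
                written.add(file);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        String text = "id,name\n1,Zo\u00EB\n"; // sample content standing in for cstest.csv
        Path outDir = Files.createTempDirectory("csv-charsets");
        for (Path file : writeVariants(text, outDir)) {
            System.out.println(file.getFileName() + " " + Files.size(file) + " bytes");
        }
    }
}
```

Keeping the variants on disk, as the issue argues, means a failing case can be inspected directly, for example with a hex dump of the offending file.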

This message was sent by Atlassian JIRA
