commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hoa Phan <>
Subject Re: [CSV] Release again
Date Wed, 03 Jul 2019 08:18:36 GMT

Sent from Yahoo Mail on Android 
  On Wed, Jul 3, 2019 at 18:13, Alex Herbert<> wrote:   

> On 15 Jun 2019, at 19:59, Alex Herbert <> wrote:
>> On 15 Jun 2019, at 18:57, Gary Gregory <> wrote:
>> We've fixed some issues immediately after 1.7. How does everyone feel about
>> releasing again?
>> What else needs to be addressed in the short term?
>> Gary
> - Bug (picked up by FindBugs):
> CSVRecord is no longer serialisable as it stores the CSVParser and that is not serialisable.
This can be fixed by making the parser transient and documenting the getParser() method as
invalid after deserialisation. Otherwise the cascade of fields that must be serialisable eventually
includes the original BufferedReader used to read the data. The parser is required for the
header map functionality and the getParser() method. So the class can store the header map
(easy), or overload the serialisation methods to record and load the header map (more effort),
or not support the parser and the header map functionality after deserialisation, or something
> This should have been picked up by FindBugs during the release cycle but for unknown
reasons the FindBugs report on the site does not show this. So anyone who deserialises this
class will get an error as the serialVersionUID has not been updated and the new parser field
is not serialisable.

I’ve raised this as a bug (CSV-248). It requires a decision on how to fix it.

> I also raised a few issues about the handling of column headers in my recent fix PR:
> - Bug:
> If the settings are not allowing empty columns headers you can currently use a single
empty header. This is because column headers are only checked for empty when they are duplicates.
So it is the second empty header (the first duplicate) that raises an error. This test should
pass but does not:
> // This fails to throw an exception.
> // Only 1 column name is missing and this is not recognised as a problem.
> @Test(expected = IllegalArgumentException.class)
> public void testHeadersMissing1ColumnException() throws Exception {
>    final Reader in = new StringReader("a,,c,d\n1,2,3,4\nx,y,z,zz");
>    CSVFormat.DEFAULT.withHeader().parse(in).iterator();
> }

I’ve raised this as a bug (CSV-247). It is a rearrangement of the checking logic and I will
raise a PR to fix it.

> - Documentation:
> If you are using the map of the header to column index it should be noted that under
certain circumstances where you allow duplicate column headers that the map cannot represent
the one-to-many relationship. The map is only guaranteed to be valid if the record was created
with a format that does not allow duplicate or empty header names.

Any objections to updating the Javadoc to record the limitations of the header map?

> - Bug / Documentation
> The CSVParser only stores headers names in a list of header names if they are not null.
So the list can be shorter than the number of columns if you use a format that allows empty
> If the purpose of the list is to provide the correct order of the column headers then
it should contain empty entries too. Otherwise I don’t see the purpose of storing this list
unless it is documented what it will contain. This appears to be a list of non-null header
names in the order they occur in the header and thus represent keys that can be used in the
header map. But you can get that (without the ordering) using the map ketSet.

Is this a bug or should it be documented as a feature?

I consider it a bug. The list of header names does not always reflect the header, i.e. it
should contain empty entries if the header was empty.


To unsubscribe, e-mail:
For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message