commons-dev mailing list archives

From Benedikt Ritter <benerit...@gmail.com>
Subject Re: [CSV] Add dependency to Commons IO and CSV-107
Date Wed, 25 Jun 2014 19:28:11 GMT
Hi,

I agree with sebb.

Users can still create a BOMInputStream, wrap it into a reader and pass it
to CSVParser.parse(final Reader reader, final CSVFormat format).
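
A minimal sketch of that wrapping idea, using only the JDK so it runs without
extra dependencies. The class BomStrip and the helper readerWithoutBom are
hypothetical names for illustration; in real code Commons IO's BOMInputStream
would take the helper's place, and the resulting Reader would go to
CSVParser.parse(Reader, CSVFormat). The sketch only handles the UTF-8 BOM
(bytes EF BB BF), not the other BOM kinds mentioned in the thread.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class BomStrip {

    /**
     * Hypothetical helper: returns a UTF-8 Reader positioned after a
     * leading UTF-8 BOM, if one is present. Other BOMs are not handled.
     */
    static Reader readerWithoutBom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to three bytes; a single read() may return fewer.
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: push the bytes back so the reader sees them again.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Sample input with a BOM in front of the header line.
        byte[] data = "\uFEFFDate,Value\n2014-06-25,1\n"
                .getBytes(StandardCharsets.UTF_8);

        BufferedReader reader =
                new BufferedReader(readerWithoutBom(new ByteArrayInputStream(data)));
        String firstHeader = reader.readLine().split(",")[0];

        // Without the stripping step the first header would start with U+FEFF.
        System.out.println(firstHeader.equals("Date"));
    }
}
```

The JDK's UTF-8 decoder leaves the BOM in the character stream as U+FEFF, so
the bytes have to be removed before the Reader is created.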

I'm open to moving ExtendedBufferedReader to IO, if it fits. But in that
case I would probably use the shade plugin to get it back into CSV.

Just my 2 cents,
Benedikt


2014-06-19 18:00 GMT+02:00 sebb <sebbaz@gmail.com>:

> On 19 June 2014 15:00, Gary Gregory <garydgregory@gmail.com> wrote:
> > To support https://issues.apache.org/jira/browse/CSV-107, it would make
> > life easy to depend on Commons IO to use BOMInputStream and the classes
> > it depends on instead of copying them to [csv].
> >
> > I think we need to deal with BOMs in [csv] because casual users may not
> > recognize the problem, and using a BOMInputStream or other workaround is
> > not trivial to find. A user first has to recognize that the stream has a
> > thing called a BOM, and then find a clean way to deal with said BOM. In
> > addition, there are different kinds of BOMs with different sizes to deal
> > with.
>
> This seems out of scope for CSV to me.
>
> However, if it is decided to add this, then I think it needs to be
> added to ALL file readers.
> Otherwise, the user will need to know about the BOM in advance.
> In which case they can add their own code plus dependency to handle it
> (as is done in the CSVParserTest#testBOMInputStream() method now).
>
> What appears to happen with the test case is that the BOM is included
> as part of the first column header name, so the testBOM() unit test
> fails because the "Date" column is not present.
>
>
> > I am fine with adding this dependency. Other Commons components depend
> > on each other.
>
> I don't think the use-case warrants adding the dependency.
>
> I would prefer CSV to report some kind of error if a BOM is seen in
> the input file.
>
> AFAICT the BOM is stored at the start of the first record, so it
> should be possible to detect a BOM input file by looking for the
> relevant bytes.
>
> > We can then also talk about whether ExtendedBufferedReader is generic
> > enough to move to [io].
> >
> > Gary
