commons-dev mailing list archives

From Simon Spero <sesunc...@gmail.com>
Subject Re: [CSV] CSVMutableRecord
Date Thu, 17 Aug 2017 21:04:14 GMT
On Aug 15, 2017 8:01 PM, "Gilles" <gilles@harfang.homelinux.org> wrote:

>> Saying that making record mutable is "breaking" is a bit unfair when we do
>> NOT document the mutability of the class in the first place.
>
> I'm stating a fact: class is currently immutable, change would make it
> mutable; it is functionally breaking.
> I didn't say that you are forbidden to do it; just that it would be unwise,
> particularly if it would be to save a few bytes.


Exactly.

TL;DR: Making the record mutable is almost always a breaking semantic
change; the safest ways of implementing it are binary breaking; it's
unlikely to have a major performance impact; it might be better to create a
new API module for enhancements, with the current package as legacy or
implementation.

If a class previously exposed no mutators, adding one is usually a major
change. This is especially true for final classes, but it still affects use
cases where an instance is owned by another class, which may rely on the
lack of mutability to avoid making defensive copies.
Of course, a final class that has a package-private getter for a shared
copy of its backing array could be considered to be sending mixed
messages...
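
To make the defensive-copy point concrete, here is a minimal sketch. MutableRow and RowHolder are hypothetical names invented for illustration; they are not the real CSVRecord or any Commons CSV class. The holder was written on the assumption that rows never change, so it keeps the reference it is given.

```java
// Hypothetical record type that has gained a setter; not the real CSVRecord.
final class MutableRow {
    private final String[] values;

    MutableRow(String... values) {
        this.values = values.clone(); // copy once, at construction
    }

    String get(int index) {
        return values[index];
    }

    // The newly added mutator.
    void set(int index, String value) {
        values[index] = value;
    }
}

// Hypothetical owner written while rows were immutable: it stores the
// reference as-is and makes no defensive copy.
final class RowHolder {
    private final MutableRow row;

    RowHolder(MutableRow row) {
        this.row = row;
    }

    String first() {
        return row.get(0);
    }
}

final class MutationDemo {
    public static void main(String[] args) {
        MutableRow row = new MutableRow("a", "b");
        RowHolder holder = new RowHolder(row);
        row.set(0, "changed");              // caller mutates after handing the row over
        System.out.println(holder.first()); // prints "changed"
    }
}
```

The holder compiles unchanged against the mutable version, yet its observable behaviour differs, which is the sense in which the change is semantically breaking.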

It is possible that a mutable class might have significant performance
advantages over an immutable one beyond saving a few bytes. For example, if
the updates are simple, and depend on the previous value of the cell, then
a mutable version might have better cache behavior. If there are other
sources of cache pressure, this might have a higher-than-expected impact.
The costs of copying the original values might also be relatively
significant.

For an ETL use case these issues are unlikely to be limiting factors; for a
start, there's a non-zero chance that a CSVRecord was extracted by
parsing a CSV file. Also, a transform will require conversion to some sort
of Number (or String allocation).

The current API doesn't easily support adding alternate implementations of
the relevant types. Implementation classes are final, and important return
types are concrete.

One solution might be to treat the current code as almost an implementation
module, define a separate API module, and add extra interfaces and
alternate implementations to support the target use case (mutable
records, streams, ReactiveX, transform functions or what have you).
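
A rough sketch of what such a split could look like, under stated assumptions: every name below (RecordView, MutableRecordView, FixedRecord, ArrayRecord) is hypothetical and does not exist in Commons CSV. The idea is only that callers depend on interfaces, so a mutable implementation can be added without touching the immutable one.

```java
// Hypothetical API-module interfaces; none of these exist in Commons CSV.
interface RecordView {
    String get(int index);

    int size();
}

// Mutability is opt-in through a separate sub-interface.
interface MutableRecordView extends RecordView {
    void set(int index, String value);
}

// Stand-in for the existing immutable implementation.
final class FixedRecord implements RecordView {
    private final String[] values;

    FixedRecord(String... values) {
        this.values = values.clone();
    }

    @Override
    public String get(int index) {
        return values[index];
    }

    @Override
    public int size() {
        return values.length;
    }
}

// Alternate implementation for the mutable use case; it copies its source,
// so the original record is never modified.
final class ArrayRecord implements MutableRecordView {
    private final String[] values;

    ArrayRecord(RecordView source) {
        values = new String[source.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = source.get(i);
        }
    }

    @Override
    public String get(int index) {
        return values[index];
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public void set(int index, String value) {
        values[index] = value;
    }
}
```

Code that accepts a RecordView keeps its read-only guarantee at the type level, while a transform step can ask for a MutableRecordView explicitly.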

Simon
