commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles <gil...@harfang.homelinux.org>
Subject Re: [CSV] CSVMutableRecord
Date Fri, 18 Aug 2017 18:50:59 GMT
On Fri, 18 Aug 2017 12:41:01 -0600, Gary Gregory wrote:
> On Wed, Aug 16, 2017 at 6:28 PM, Gilles 
> <gilles@harfang.homelinux.org>
> wrote:
>
>> On Wed, 16 Aug 2017 11:27:53 -0600, Gary Gregory wrote:
>>
>>> Let's summarize the options:
>>>
>>> 0) do nothing
>>> 1) Add two put methods to CVSRecord making the class mutable
>>> 2) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat 
>>> such
>>> that a new boolean in CVSRecord allow method from 1) above to 
>>> either work
>>> or throw an exception.
>>> 3) Add a "mutableRecord" boolean option to CVSRecord and CSVFormat 
>>> such
>>> that subclass of CVSRecord called CVSMutableRecord is created which
>>> contains two new put methods
>>>
>>> What else?
>>>
>>
>> 4) The factory method:
>>  /**
>>   * @param orig Original to be copied.
>>   * @param replace Fields to be replaced.
>>   * @return a copy of "orig", except for the fields in "replace".
>>   */
>>  public static CSVRecord createRecord(CSVRecord orig,
>>                                       Pair<Integer, String> ... 
>> replace)
>>
>> Could also be:
>>  public static CSVRecord createRecord(CSVRecord orig,
>>                                       int[] replaceIndices,
>>                                       String[] replaceValues)
>
>
> To me, that feels like bending over backwards and hard to very ugly 
> to use,
> building an array of ints, building an array of Strings. Yikes. But, 
> hey
> that's just me.

What about the "Pair" alternative above?

Another alternative would be to have a "CSVRecordBuilder" (with "put"
methods):
---CUT---
CSVRecord rec = readRecord(source);
rec = new CSVRecordBuilder(rec).put(1, "Change").put(4, 
"this").build();
---CUT---


Gilles

>
> Gary
>
>>
>>
>> Gilles
>>
>>
>>
>>> I like the simplest option: 1.
>>>
>>> Gary
>>>
>>> On Tue, Aug 15, 2017 at 6:01 PM, Gilles 
>>> <gilles@harfang.homelinux.org>
>>> wrote:
>>>
>>> On Tue, 15 Aug 2017 17:43:26 -0600, Gary Gregory wrote:
>>>>
>>>> On Tue, Aug 15, 2017 at 5:32 PM, Gilles 
>>>> <gilles@harfang.homelinux.org>
>>>>> wrote:
>>>>>
>>>>> On Tue, 15 Aug 2017 22:52:32 +0000, nitin mahendru wrote:
>>>>>
>>>>>>
>>>>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>>>>
>>>>>>>> <nitin.mahendru88@gmail.com
>>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How about having a state in the class itself which says
that 
>>>>>>>>> it's
>>>>>>>>
>>>>>>>> mutable
>>>>>>>>> or not.
>>>>>>>>> If we call a setter on an immutable then it throws an

>>>>>>>>> exception.
>>>>>>>>> By default the records are immutable and you need to
make 
>>>>>>>>> them
>>>>>>>>> mutable
>>>>>>>>> using a new API.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> A code example would be useful...
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Below is the pull request I added.
>>>>>>> https://github.com/apache/commons-csv/pull/21
>>>>>>>
>>>>>>>
>>>>>>> As I indicated in the previous message, this is functionally
>>>>>> breaking. [I'm diverting this discussion over to the "dev"
>>>>>> mailing list.]
>>>>>>
>>>>>>
>>>>>> Saying that making record mutable is "breaking" is a bit unfair 
>>>>>> when
>>>>> we do
>>>>> NOT document the mutability of the class in the first place.
>>>>>
>>>>>
>>>> I'm stating a fact: class is currently immutable, change
>>>> would make it mutable; it is functionally breaking.
>>>> I didn't say that you are forbidden to do it; just that
>>>> it would be unwise, particularly if it would be to save
>>>> a few bytes.
>>>>
>>>> Gilles
>>>>
>>>>
>>>>
>>>> Gary
>>>>>
>>>>>
>>>>>
>>>>> The following should be an interesting read:
>>>>>>   http://markmail.org/message/6ytvmxvy2ndsfp7h
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Gilles
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 15, 2017 at 11:17 AM Gilles 
>>>>>> <gilles@harfang.homelinux.org>
>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>> On Tue, 15 Aug 2017 12:02:20 -0600, Gary Gregory wrote:
>>>>>>>
>>>>>>> > On Tue, Aug 15, 2017 at 10:38 AM, nitin mahendru
>>>>>>>> > <nitin.mahendru88@gmail.com
>>>>>>>> >> wrote:
>>>>>>>> >
>>>>>>>> >> How about having a state in the class itself which
says 
>>>>>>>> that it's
>>>>>>>> >> mutable
>>>>>>>> >> or not.
>>>>>>>> >> If we call a setter on an immutable then it throws
an 
>>>>>>>> exception.
>>>>>>>> >> By default the records are immutable and you need
to make 
>>>>>>>> them
>>>>>>>> >> mutable
>>>>>>>> >> using a new API.
>>>>>>>>
>>>>>>>> A code example would be useful...
>>>>>>>>
>>>>>>>> >> pros: Saves memory, Keeps the immutability benefits
>>>>>>>>
>>>>>>>> What kind of usage are you considering that a single transient
>>>>>>>> record matters (as compared to the ~300 MB of the JVM 
>>>>>>>> itself...)?
>>>>>>>>
>>>>>>>> >> cons: people using "mutable" records need to be

>>>>>>>> careful.(While
>>>>>>>> >> threading
>>>>>>>> >> maybe)
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> > Interesting idea!
>>>>>>>> >
>>>>>>>> > But I think I like the idea of a subclass better if
we are 
>>>>>>>> going to
>>>>>>>> > split
>>>>>>>> > the behavior b/w mutable and immutable.
>>>>>>>>
>>>>>>>> Once you have a subclass that is able to modify the state
of
>>>>>>>> its parent, it's a mutable object. Period.
>>>>>>>> There is no such thing as a "split".
>>>>>>>>
>>>>>>>> >
>>>>>>>> > For my money and the KISS principle, I would just add
the 
>>>>>>>> put
>>>>>>>> method
>>>>>>>> > in
>>>>>>>> > CSVRecord.
>>>>>>>>
>>>>>>>> Then, any use that assumes immutability will be broken.
>>>>>>>>
>>>>>>>>
>>>>>>>> Gilles
>>>>>>>>
>>>>>>>>
>>>>>>>> > Gary
>>>>>>>> >
>>>>>>>> >>
>>>>>>>> >> -Nitin
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> On Tue, Aug 15, 2017 at 9:01 AM Gilles
>>>>>>>> >> <gilles@harfang.homelinux.org>
>>>>>>>> >> wrote:
>>>>>>>> >>
>>>>>>>> >> > On Tue, 15 Aug 2017 09:49:04 -0600, Gary Gregory
wrote:
>>>>>>>> >> > > That looks odd to me. What comes up for
me is the use 
>>>>>>>> case
>>>>>>>> where
>>>>>>>> >> I
>>>>>>>> >> > > want to
>>>>>>>> >> > > ETL a file of 10,000,000 records and update,
say, one 
>>>>>>>> column.
>>>>>>>> If
>>>>>>>> >> am
>>>>>>>> >> > > forced
>>>>>>>> >> > > to create a brand new record for every
record read, 
>>>>>>>> that would
>>>>>>>> >> be a
>>>>>>>> >> > > shame.
>>>>>>>> >> >
>>>>>>>> >> > Why?
>>>>>>>> >> >
>>>>>>>> >> > > If I had a mutable record, I could just
keep on 
>>>>>>>> updating it
>>>>>>>> and
>>>>>>>> >> using
>>>>>>>> >> > > it to
>>>>>>>> >> > > write each row. Read record, update it,
write record. 
>>>>>>>> No extra
>>>>>>>> >> memory
>>>>>>>> >> > > needed.
>>>>>>>> >> >
>>>>>>>> >> > How is the size of 1 additional record going
to matter 
>>>>>>>> compared
>>>>>>>> to
>>>>>>>> >> the
>>>>>>>> >> > size of the whole program?
>>>>>>>> >> >
>>>>>>>> >> > > Either we can make the current record
mutable (what's 
>>>>>>>> the
>>>>>>>> harm?)
>>>>>>>> >> or
>>>>>>>> >> > > we can
>>>>>>>> >> > > make the parser serve out mutable records
based on a 
>>>>>>>> config
>>>>>>>> >> setting.
>>>>>>>> >> > > This
>>>>>>>> >> > > could be a subclass of CSVRecord with
the extra method 
>>>>>>>> I
>>>>>>>> >> proposed.
>>>>>>>> >> >
>>>>>>>> >> > The harm is that you loose all the promises
of 
>>>>>>>> immutability.
>>>>>>>> >> >
>>>>>>>> >> > Regards,
>>>>>>>> >> > Gilles
>>>>>>>> >> >
>>>>>>>> >> > >
>>>>>>>> >> > > Thoughts?
>>>>>>>> >> > >
>>>>>>>> >> > > Gary
>>>>>>>> >> > >
>>>>>>>> >> > > On Tue, Aug 15, 2017 at 8:33 AM, Gilles
>>>>>>>> >> > > <gilles@harfang.homelinux.org>
>>>>>>>> >> > > wrote:
>>>>>>>> >> > >
>>>>>>>> >> > >> On Tue, 15 Aug 2017 08:01:53 -0600,
Gary Gregory 
>>>>>>>> wrote:
>>>>>>>> >> > >>
>>>>>>>> >> > >>> How does that work when you want
to change more than 
>>>>>>>> one
>>>>>>>> >> value?
>>>>>>>> >> > >>>
>>>>>>>> >> > >>
>>>>>>>> >> > >> How about a "vararg" argument:
>>>>>>>> >> > >>
>>>>>>>> >> > >> /**
>>>>>>>> >> > >>  * @param orig Original to be copied.
>>>>>>>> >> > >>  * @param replace Fields to be replaced.
>>>>>>>> >> > >>  */
>>>>>>>> >> > >> public static CSVRecord createRecord(CSVRecord
orig,
>>>>>>>> >> > >>                                  
   Pair<Integer, 
>>>>>>>> String>
>>>>>>>> ...
>>>>>>>> >> > >> replace) {
>>>>>>>> >> > >>     // ...
>>>>>>>> >> > >> }
>>>>>>>> >> > >>
>>>>>>>> >> > >>
>>>>>>>> >> > >> Gilles
>>>>>>>> >> > >>
>>>>>>>> >> > >>
>>>>>>>> >> > >>
>>>>>>>> >> > >>> Gary
>>>>>>>> >> > >>>
>>>>>>>> >> > >>> On Aug 15, 2017 00:17, "Benedikt
Ritter" <
>>>>>>>> britter@apache.org>
>>>>>>>> >> > >>> wrote:
>>>>>>>> >> > >>>
>>>>>>>> >> > >>> Hi,
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>> I very much like that CSVRecord
is unmodifiable. So 
>>>>>>>> I’d
>>>>>>>> >> suggest an
>>>>>>>> >> > >>>> API,
>>>>>>>> >> > >>>> that creates a new record
instead of mutating the 
>>>>>>>> existing
>>>>>>>> >> one:
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>> CSVRecord newRecord = myRecord.put(1,
„value")
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>> I’m not sure about „put“
as a method name since it 
>>>>>>>> clashes
>>>>>>>> >> with
>>>>>>>> >> > >>>> java.util.Map#put, which is
mutation based...
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>> Regards,
>>>>>>>> >> > >>>> Benedikt
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>> > Am 15.08.2017 um 02:54
schrieb Gary Gregory
>>>>>>>> >> > >>>> <garydgregory@gmail.com>:
>>>>>>>> >> > >>>> >
>>>>>>>> >> > >>>> > Feel free to provide
a PR on GitHub :-)
>>>>>>>> >> > >>>> >
>>>>>>>> >> > >>>> > Gary
>>>>>>>> >> > >>>> >
>>>>>>>> >> > >>>> > On Aug 14, 2017 15:29,
"Gary Gregory"
>>>>>>>> >> <garydgregory@gmail.com>
>>>>>>>> >> > >>>> wrote:
>>>>>>>> >> > >>>> >
>>>>>>>> >> > >>>> >> I think we've kept
the design as YAGNI as 
>>>>>>>> possible...
>>>>>>>> :-)
>>>>>>>> >> > >>>> >>
>>>>>>>> >> > >>>> >> Gary
>>>>>>>> >> > >>>> >>
>>>>>>>> >> > >>>> >> On Mon, Aug 14, 2017
at 3:25 PM, nitin mahendru <
>>>>>>>> >> > >>>> >> nitin.mahendru88@gmail.com>
wrote:
>>>>>>>> >> > >>>> >>
>>>>>>>> >> > >>>> >>> Yeah that also
is OK. I though there is a reason 
>>>>>>>> to
>>>>>>>> keep
>>>>>>>> >> the
>>>>>>>> >> > >>>> CSVRecord
>>>>>>>> >> > >>>> >>> without setters.
But maybe not!
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>> Nitin
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>> On Mon, Aug 14,
2017 at 2:22 PM Gary Gregory
>>>>>>>> >> > >>>> <garydgregory@gmail.com
>>>>>>>> >> > >>>> >
>>>>>>>> >> > >>>> >>> wrote:
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>>> Hi All:
>>>>>>>> >> > >>>> >>>>
>>>>>>>> >> > >>>> >>>> Should we
consider adding put(int,Object) and
>>>>>>>> >> put(String,
>>>>>>>> >> > >>>> Object) to
>>>>>>>> >> > >>>> the
>>>>>>>> >> > >>>> >>>> current CSVRecord
class?
>>>>>>>> >> > >>>> >>>>
>>>>>>>> >> > >>>> >>>> Gary
>>>>>>>> >> > >>>> >>>>
>>>>>>>> >> > >>>> >>>> On Mon, Aug
14, 2017 at 2:54 PM, nitin mahendru 
>>>>>>>> <
>>>>>>>> >> > >>>> >>>> nitin.mahendru88@gmail.com>
>>>>>>>> >> > >>>> >>>> wrote:
>>>>>>>> >> > >>>> >>>>
>>>>>>>> >> > >>>> >>>>> Hi Everyone,
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> I recently
pushed a change(pull request 20) to 
>>>>>>>> get
>>>>>>>> the
>>>>>>>> >> line
>>>>>>>> >> > >>>> ending
>>>>>>>> >> > >>>> >>> from
>>>>>>>> >> > >>>> >>>> the
>>>>>>>> >> > >>>> >>>>> parser.
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> Now I
want to push another change which I feel 
>>>>>>>> will
>>>>>>>> >> also be
>>>>>>>> >> > >>>> useful
>>>>>>>> >> > >>>> for
>>>>>>>> >> > >>>> >>>> the
>>>>>>>> >> > >>>> >>>>> community.
I want to add a CSVRecordMutable 
>>>>>>>> class
>>>>>>>> which
>>>>>>>> >> had
>>>>>>>> >> > >>>> a
>>>>>>>> >> > >>>> >>> constructor
>>>>>>>> >> > >>>> >>>>> which
accepts a CSVRecord object. So when we 
>>>>>>>> have a
>>>>>>>> >> > >>>> CSVRecordMutable
>>>>>>>> >> > >>>> >>>> object
>>>>>>>> >> > >>>> >>>>> from
it then we can edit individual columns 
>>>>>>>> using it.
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> I would
be using this to write back my edited 
>>>>>>>> CSV
>>>>>>>> file.
>>>>>>>> >> My
>>>>>>>> >> > >>>> use case
>>>>>>>> >> > >>>> >>> is to
>>>>>>>> >> > >>>> >>>>> read
a csv, mangle some columns, write back a 
>>>>>>>> new
>>>>>>>> csv.
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> I could
have directly raised a pull request 
>>>>>>>> but I
>>>>>>>> just
>>>>>>>> >> > >>>> wanted to
>>>>>>>> >> > >>>> float
>>>>>>>> >> > >>>> >>>> the
>>>>>>>> >> > >>>> >>>>> idea
before and see the reaction.
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> Thanks
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>> Nitin
>>>>>>>> >> > >>>> >>>>>
>>>>>>>> >> > >>>> >>>>
>>>>>>>> >> > >>>> >>>
>>>>>>>> >> > >>>> >>
>>>>>>>> >> > >>>> >>
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>>>
>>>>>>>> >> > >>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message