commons-dev mailing list archives

From Rob Spoor <apa...@icemanx.nl>
Subject Re: [IO] Improvements to CharSequenceReader
Date Sat, 10 Aug 2019 13:03:05 GMT
Hi Gary,

I just created two PRs. #90 is for the use of getChars only. #91 was 
built from that one, and adds the sub sequence support. In the latter I 
have added a test for de-serializing a CharSequenceReader that I 
serialized before adding any new code, to make sure this still works.

Rob


On 09/08/2019 23:18, Gary Gregory wrote:
> Hi Rob,
> 
> Do you plan on creating a PR?
> 
> Gary
> 
> On Thu, Aug 1, 2019 at 7:00 AM Rob Spoor <apache@icemanx.nl> wrote:
> 
>> On 01/08/2019 12:31, Rob Spoor wrote:
>>> Hi,
>>>
>>> CharSequenceReader is great, but I think there can be two improvements:
>>>
>>> 1) read(char[], int, int) currently calls read() repeatedly, which
>>> delegates to charSequence.charAt. That's fine in Java 8 and earlier, but
>>> Java 9 changed the internal storage of String, StringBuilder and
>>> StringBuffer, so it's better to copy in bulk with getChars. For instance,
>>> after the length/offset check:
>>>
>>>           if (charSequence instanceof String) {
>>>               int count = Math.min(length, charSequence.length() - idx);
>>>               ((String) charSequence).getChars(idx, idx + count, array, offset);
>>>               return count;
>>>           }
>>>           // similar for StringBuilder and StringBuffer
>>>           // existing code
>>
>> Small fix: before returning count, idx += count should be added.
>>
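Putting the snippet and the small fix together, a bulk read might look roughly like the following. This is only a sketch: BulkCharSequenceReader is a made-up name for illustration, not the actual Commons IO class, and the bounds and EOF checks are assumed to mirror what the existing code does before the copy.

```java
import java.io.Reader;
import java.util.Objects;

/** Hypothetical stand-in for CharSequenceReader; illustrative only. */
class BulkCharSequenceReader extends Reader {
    private final CharSequence charSequence;
    private int idx;

    BulkCharSequenceReader(CharSequence charSequence) {
        this.charSequence = charSequence != null ? charSequence : "";
    }

    @Override
    public int read() {
        if (idx >= charSequence.length()) {
            return -1; // end of sequence
        }
        return charSequence.charAt(idx++);
    }

    @Override
    public int read(char[] array, int offset, int length) {
        Objects.requireNonNull(array, "array");
        if (offset < 0 || length < 0 || length > array.length - offset) {
            throw new IndexOutOfBoundsException(
                    "offset=" + offset + ", length=" + length + ", array.length=" + array.length);
        }
        if (idx >= charSequence.length()) {
            return -1; // end of sequence
        }
        int count = Math.min(length, charSequence.length() - idx);
        if (charSequence instanceof String) {
            ((String) charSequence).getChars(idx, idx + count, array, offset);
        } else if (charSequence instanceof StringBuilder) {
            ((StringBuilder) charSequence).getChars(idx, idx + count, array, offset);
        } else if (charSequence instanceof StringBuffer) {
            ((StringBuffer) charSequence).getChars(idx, idx + count, array, offset);
        } else {
            // Fallback for other CharSequence implementations: copy one char at a time.
            for (int i = 0; i < count; i++) {
                array[offset + i] = charSequence.charAt(idx + i);
            }
        }
        idx += count; // the small fix: advance the position before returning
        return count;
    }

    @Override
    public void close() {
        idx = 0;
    }
}
```

The three instanceof branches are needed because getChars is declared on each class separately and is not part of the CharSequence interface.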
>>
>>> 2) It would be nice to be able to create a CharSequenceReader for only a
>>> portion of the CharSequence. This avoids having to call subSequence,
>>> which creates a whole new String for StringBuilder and StringBuffer.
>>> The class would get two extra int fields, start and end:
>>> * start is only used in close() to reset idx and mark to start instead of 0
>>> * end is used as upper bound instead of charSequence.length()
>>>
>>> To prevent IndexOutOfBoundsExceptions if the CharSequence is shrunk, a
>>> private method end() could be used, which returns Math.min(end,
>>> charSequence.length()). The original constructor could then set start
>>> and end to 0 and Integer.MAX_VALUE respectively. One or two constructors
>>> would need to be added to support setting the start and end, and
>>> possibly only the start.
>>>
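The start/end idea described above could be sketched as follows. RangeCharSequenceReader and its constructors are hypothetical names for illustration, not the code from the PR; initialising idx and mark to start in the constructor and the argument validation are assumptions on top of what the message describes.

```java
import java.io.Reader;

/** Hypothetical range-limited reader; illustrative only. */
class RangeCharSequenceReader extends Reader {
    private final CharSequence charSequence;
    private final int start;
    private final int end;
    private int idx;
    private int mark;

    /** Whole sequence: start = 0, end = Integer.MAX_VALUE, as proposed. */
    RangeCharSequenceReader(CharSequence charSequence) {
        this(charSequence, 0, Integer.MAX_VALUE);
    }

    /** From start up to the end of the sequence. */
    RangeCharSequenceReader(CharSequence charSequence, int start) {
        this(charSequence, start, Integer.MAX_VALUE);
    }

    RangeCharSequenceReader(CharSequence charSequence, int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("start=" + start + ", end=" + end);
        }
        this.charSequence = charSequence != null ? charSequence : "";
        this.start = start;
        this.end = end;
        this.idx = start;
        this.mark = start;
    }

    /** Effective upper bound; clamped in case the sequence has shrunk. */
    private int end() {
        return Math.min(end, charSequence.length());
    }

    @Override
    public int read() {
        if (idx >= end()) {
            return -1;
        }
        return charSequence.charAt(idx++);
    }

    @Override
    public int read(char[] array, int offset, int length) {
        if (offset < 0 || length < 0 || length > array.length - offset) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        if (idx >= end()) {
            return -1;
        }
        int count = Math.min(length, end() - idx);
        for (int i = 0; i < count; i++) {
            array[offset + i] = charSequence.charAt(idx + i);
        }
        idx += count;
        return count;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int readAheadLimit) {
        mark = idx;
    }

    @Override
    public void reset() {
        idx = mark;
    }

    /** Resets idx and mark to start instead of 0. */
    @Override
    public void close() {
        idx = start;
        mark = start;
    }
}
```

Because end() is re-evaluated on every read, a StringBuilder that is truncated after the reader was created simply yields an earlier end of stream rather than an IndexOutOfBoundsException.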
>>> One thing that may make this difficult is serialization.
>>> CharSequenceReader implements Serializable, so it should first be
>>> investigated whether this breaks serialization.
>>>
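One way to start probing the serialization concern is an in-memory round trip, sketched below. SerializableSequenceReader and SerializationRoundTrip are hypothetical names, and this only checks that state survives writeObject/readObject within one class version. A real compatibility test would instead deserialize bytes captured from the class before the change. As far as I know, when the serialVersionUID matches, fields missing from an older stream are left at their default values (0 or null), which matters for a newly added end field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Reader;
import java.io.Serializable;

/** Hypothetical serializable reader; illustrative only. */
class SerializableSequenceReader extends Reader implements Serializable {
    private static final long serialVersionUID = 1L;

    private final CharSequence charSequence; // must itself be serializable, e.g. a String
    private int idx;

    SerializableSequenceReader(CharSequence charSequence) {
        this.charSequence = charSequence != null ? charSequence : "";
    }

    @Override
    public int read() {
        if (idx >= charSequence.length()) {
            return -1;
        }
        return charSequence.charAt(idx++);
    }

    @Override
    public int read(char[] array, int offset, int length) {
        if (offset < 0 || length < 0 || length > array.length - offset) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        if (idx >= charSequence.length()) {
            return -1;
        }
        int count = Math.min(length, charSequence.length() - idx);
        for (int i = 0; i < count; i++) {
            array[offset + i] = charSequence.charAt(idx + i);
        }
        idx += count;
        return count;
    }

    @Override
    public void close() {
        idx = 0;
    }
}

/** Hypothetical helper: serializes a value to bytes and reads it back. */
final class SerializationRoundTrip {
    private SerializationRoundTrip() {
    }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

The copy returned by roundTrip keeps the read position of the original at the time it was serialized, so both continue from the same character independently.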
>>>
>>> Kind regards,
>>>
>>> Rob
