commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Herbert <alex.d.herb...@gmail.com>
Subject Re: [rng] Update ProviderBuilder factory methods
Date Wed, 05 Jun 2019 22:35:04 GMT


> On 5 Jun 2019, at 22:51, Gilles Sadowski <gilleseran@gmail.com> wrote:
> 
> Hi.
> 
> Le lun. 3 juin 2019 à 16:49, Alex Herbert <alex.d.herbert@gmail.com <mailto:alex.d.herbert@gmail.com>>
a écrit :
>> 
>> Can I get a review of a PR that changes the ProviderBuilder [1]?
>> 
>> The aim is to move the creation of the seed and the RNG into the
>> RandomSourceInternal.
>> 
>> This allows for customisation of the seed creation on a per-RNG basis.
>> The reason is that some new generators in the library for v1.3 require
>> the seed is non-zero. For example this is a realistic possibility when
>> the seed is an array of ints of size 2.
>> 
>> The code change has required updating the routines using the
>> SeedConverter interface and adding a Seed2ArrayConverter interface:
>> 
>> public interface Seed2ArrayConverter<IN, OUT> extends SeedConverter<IN,
OUT> {
>>     /**
>>      * Converts seed from input type to output type. The output type is expected
to be an array.
>>      *
>>      * @param seed Original seed value.
>>      * @param outputSize Output size.
>>      * @return the converted seed value.
>>      */
>>     OUT convert(IN seed, int outputSize);
>> }
>> 
>> I've moved conversion to a new enum class for seed creation and
>> conversion. Previously the ProviderBuilder used maps to store all the
>> conversions. These are now explicitly written as conversion methods in
>> the enum. The amount of code is the same and the conversions are the
>> same. However use of the enum for conversion has removed the need to
>> support the Seed2ArrayConverter interface in all the converters. This
>> has deprecated some methods and classes. I've not yet marked them as so
>> in the PR.
>> 
>> i.e.
>> 
>> private static final Map<Class<?>, SeedConverter<long[], ?>> CONV_LONG_ARRAY
=
>>     new ConcurrentHashMap<Class<?>, SeedConverter<long[], ?>>();
>> 
>> does not have to become:
>> 
>> private static final Map<Class<?>, Seed2ArrayConverter<long[], ?>>
CONV_LONG_ARRAY =
>>     new ConcurrentHashMap<Class<?>, Seed2ArrayConverter<long[], ?>>();
>> 
>> with the corresponding conversion as:
>> 
>> nativeSeed = CONV_LONG_ARRAY.get(source.getSeed()).convert((long[]) seed, source.getSeedSize());
>> 
>> 
>> Currently conversions of arrays to arrays ignore the seed size. Creation
>> of a new seed is limited to a maximum of 128.
> 
> +0
> See below.
> 
>> This matches the previous
>> functionality. It could be changed to create the full length seed.
> 
> +0
> See below.
> 
>> The
>> impact would be more work done within synchronized blocks in the
>> SeedFactory. I would expect the generation to be slower but the seed
>> quality will be higher.
> 
> As per the previous discussion, special needs should be covered
> by the user.
> However, it could be construed that 128 is large enough for a casual
> user, and a larger seed could also pass as a special need that the
> user can provide explicitly...
> 
> So either way is fine I guess.

So we stick to an upper limit of 128. I was just pointing out that there is no need for this
limit anymore since the actual length is known. However I agree that a size of 128 is pretty
big. Those generators with larger seeds will self generate the rest of the seed so the full
length seed is created, just not all from the same RNG.

> 
>> 
>> Only creation of arrays from int/long seeds will use the seed size. This
>> does not operate within a synchronized block.
>> 
>> 
>> Note that RNG constructor is obtained using reflection. Previously was
>> done on each invocation. However now that the method is within the
>> RandomSourceInternal enum caching the constructor is a natural
>> modification since it is always the same.
> 
> +1
> But please rename the local variable in method "getConstructor()".[1]
> ;-)

Sorry about that. 

I have a habit of avoiding name clashes between local and class variables. I chose poorly
in this case. I'll change it to something more verbose.

> 
>> I have not done this in the
>> constructor for the RandomSourceInternal enum as:
>> 
>> - it adds overhead when building all the enums (e.g.
>> RandomSourceInternal.values())
>> 
>> - it may throw lots of different types of exception. Rather than
>> catching them all in the constructor for the enum they can now be thrown
>> during creation of the RNG instance so the user gets the appropriate
>> stack trace.
>> 
>> 
>> The change to create an array using the correct size has performance
>> implications (see [2]):
>> 
>> - For small array sizes the creation is faster
>> 
>> - For large array sizes built using an int/long the creation is
>> marginally slower (but the seed should be better as it uses the SplitMix
>> algorithm rather than the self-seeding strategy of the BaseProvider [3])
> 
> Where is it done?

In the call to:

RandomSourceInternal.convertSeed(Object seed) {
    return nativeSeedType.convertSeed(seed, nativeSeedSize);
}

If the native seed type is an array and the input seed is Integer or Long then the nativeSeedSize
is used and a full length array is created by either:

NativeSeedType.INT_ARRAY
NativeSeedType.LONG_ARRAY 

Object convert(Integer seed, int size);
Object convert(Long seed, int size);

The NativeSeedType enum has all the convertors so for example for Long -> long[]:

@Override
protected int[] convert(Long seed, int size) {
    return LONG_TO_INT_ARRAY.convert(seed, size);
}

Where LONG_TO_INT_ARRAY is using a SplitMix to expand the input seed to an array.

> 
>> 
>> Caching the constructor for use with reflection has improved performance.
> 
> +1
> 
> Thanks,
> Gilles
> 
>> 
>> 
>> [1] https://github.com/apache/commons-rng/pull/46 <https://github.com/apache/commons-rng/pull/46>
>> 
>> [2] https://issues.apache.org/jira/browse/RNG-75 <https://issues.apache.org/jira/browse/RNG-75>
>> 
>> [3] This could be tested by creating a new generator that implements the
>> self-seeding strategy as its output and running it through the stress
>> test applications.
>> 
> 
> Gilles
> 
> [1] https://www.linguee.com/french-english/translation/con.html <https://www.linguee.com/french-english/translation/con.html>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org <mailto:dev-unsubscribe@commons.apache.org>
> For additional commands, e-mail: dev-help@commons.apache.org <mailto:dev-help@commons.apache.org>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message