commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Herbert <alex.d.herb...@gmail.com>
Subject [rng] Update ProviderBuilder factory methods
Date Mon, 03 Jun 2019 14:48:58 GMT
Can I get a review of a PR that changes the ProviderBuilder [1]?

The aim is to move the creation of the seed and the RNG into the 
RandomSourceInternal.

This allows for customisation of the seed creation on a per-RNG basis. 
The reason is that some new generators in the library for v1.3 require 
the seed is non-zero. For example this is a realistic possibility when 
the seed is an array of ints of size 2.

The code change has required updating the routines using the 
SeedConverter interface and adding a Seed2ArrayConverter interface:

public interface Seed2ArrayConverter<IN, OUT> extends SeedConverter<IN, OUT> {
     /**
      * Converts seed from input type to output type. The output type is expected to be an
array.
      *
      * @param seed Original seed value.
      * @param outputSize Output size.
      * @return the converted seed value.
      */
     OUT convert(IN seed, int outputSize);
}

I've moved conversion to a new enum class for seed creation and 
conversion. Previously the ProviderBuilder used maps to store all the 
conversions. These are now explicitly written as conversion methods in 
the enum. The amount of code is the same and the conversions are the 
same. However use of the enum for conversion has removed the need to 
support the Seed2ArrayConverter interface in all the converters. This 
has deprecated some methods and classes. I've not yet marked them as so 
in the PR.

i.e.

private static final Map<Class<?>, SeedConverter<long[], ?>> CONV_LONG_ARRAY
=
     new ConcurrentHashMap<Class<?>, SeedConverter<long[], ?>>();

does not have to become:

private static final Map<Class<?>, Seed2ArrayConverter<long[], ?>> CONV_LONG_ARRAY
=
     new ConcurrentHashMap<Class<?>, Seed2ArrayConverter<long[], ?>>();

with the corresponding conversion as:

nativeSeed = CONV_LONG_ARRAY.get(source.getSeed()).convert((long[]) seed, source.getSeedSize());


Currently conversions of arrays to arrays ignore the seed size. Creation 
of a new seed is limited to a maximum of 128. This matches the previous 
functionality. It could be changed to create the full length seed. The 
impact would be more work done within synchronized blocks in the 
SeedFactory. I would expect the generation to be slower but the seed 
quality will be higher.

Only creation of arrays from int/long seeds will use the seed size. This 
does not operate within a synchronized block.


Note that RNG constructor is obtained using reflection. Previously was 
done on each invocation. However now that the method is within the 
RandomSourceInternal enum caching the constructor is a natural 
modification since it is always the same. I have not done this in the 
constructor for the RandomSourceInternal enum as:

- it adds overhead when building all the enums (e.g. 
RandomSourceInternal.values())

- it may throw lots of different types of exception. Rather than 
catching them all in the constructor for the enum they can now be thrown 
during creation of the RNG instance so the user gets the appropriate 
stack trace.


The change to create an array using the correct size has performance 
implications (see [2]):

- For small array sizes the creation is faster

- For large array sizes built using an int/long the creation is 
marginally slower (but the seed should be better as it uses the SplitMix 
algorithm rather than the self-seeding strategy of the BaseProvider [3])


Caching the constructor for use with reflection has improved performance.


[1] https://github.com/apache/commons-rng/pull/46

[2] https://issues.apache.org/jira/browse/RNG-75

[3] This could be tested by creating a new generator that implements the 
self-seeding strategy as its output and running it through the stress 
test applications.



Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message