commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Herbert <alex.d.herb...@gmail.com>
Subject Re: [rng] SharedStateSampler
Date Fri, 19 Jul 2019 13:31:20 GMT
On 19/07/2019 14:15, Gilles Sadowski wrote:
> Hello.
>
> Le ven. 19 juil. 2019 à 14:27, Alex Herbert <alex.d.herbert@gmail.com> a écrit
:
>> One principle reason for SharedStateDiscreteSampler
>> and SharedStateContinuousSampler is to simplify the current design approach
>> for samplers that use internal delegates to provide optimised algorithms.
>> However this creates an inefficient outer class that is just wrapping the
>> implementation.
>>
>> The idea was to simplify the implementation of the SharedStateSampler<R>
>> interface for these Discrete/Continuous Samplers. However it has wider
>> application to moving away from public constructors to factory methods. The
>> SharedStateDiscreteSampler interface would allow the samplers package to
>> move to a factory constructor model where samplers have no public
>> constructor. This allows the factory method to choose the best
>> implementation. This idea is recommended by Joshua Block in Effective Java
>> (see [1]).
>>
>> For example:
>>
>> UniformRandomProvider rng = ...;
>> double mean = ...;
>> PoissonSampler sampler = new PoissonSampler(rng, mean);
>>
>> vs (with some potential names):
>>
>> SharedStateDiscreteSampler sampler =
>> PoissonSampler.newSharedStateDiscreteSampler(rng, mean);
>> SharedStateDiscreteSampler sampler = PoissonSampler.newInstance(rng, mean);
>> SharedStateDiscreteSampler sampler = PoissonSampler.newSampler(rng, mean);
>> SharedStateDiscreteSampler sampler = PoissonSampler.newPoissonSampler(rng,
>> mean);
>> SharedStateDiscreteSampler sampler = PoissonSampler.of(rng, mean);
>>
>> Currently there are some unreleased classes in v1.3 that use various
>> construction methods with different approaches to fitting within the
>> current design of returning an instance that supports DiscreteSampler and
>> SharedStateSampler<R> and using specialised algorithms:
>>
>> AliasMethodDiscreteSampler
>> MarsagliaTsangWangDiscreteSampler
>> GuideTableDiscreteSampler
>> GeometricSampler
>>
>> The PoissonSamplerCache could also benefit from this as it returns a
>> DiscreteSampler even though the returned object is now also a
>> SharedStateSampler<R>.
>>
>> I would advocate introducing these interfaces and switching to a factory
>> method design for unreleased code. This has no binary compatibility issues.
>>
>> Current released code that would also benefit from this design are those
>> with internal delegates:
>>
>> PoissonSampler
>> AhrensDieterMarsagliaTsangGammaSampler
>> LargeMeanPoissonSampler
>> ChengBetaSampler (no delegate but it chooses an algorithm)
>>
>> An example of factory code would be
>>
>> double alpha;
>> double theta
>> SharedStateContinuousSampler sampler =
>> AhrensDieterMarsagliaTsangGammaSampler.of(rng, alpha, theta);
>> SharedStateContinuousSampler sampler =
>> AhrensDieterMarsagliaTsangGammaSampler.newGammaSampler(rng, alpha, theta);
>>
>> Since the class name is so verbose in this case the 'of' method name does
>> not appear out of place.
>>
>> Note that as the SharedStateSampler<R> interface is implemented by all
>> (non-deprecated) distribution samplers a factory method could be added to
>> every sampler. This would pave the way for removal of public constructors
>> in a 2.0 release (whenever that happens).
>>
>> Overall the proposal is to:
>>
>> - Create SharedStateDiscreteSampler and SharedStateContinuousSampler
>> - Simplify the implementation of SharedStateSampler<R>
>> - Move to factory constructors for unreleased samplers
>> - Add factory constructors to existing samplers to return optimised
>> implementations
> +1 (using "of" as the method name).

OK. I will create a ticket.

Note that when I did SharedStateSampler I avoided this implementation to 
minimise changes. However I recently revisited the change to the 
DiscreteUniformSampler to use a new faster sampling method for a uniform 
range (RNG-95). The result was an implementation with 5 internal classes 
to handle ranges:

[n,n]
[n,m]
[0,n]

And specialisations for powers of 2. It works and is much faster than 
the current sampler but was not very clean. I'll update this when the 
new structure is in place.

I am happy with the name 'of' for most samplers since the sampler name 
contains all the information. What method name would you recommend for a 
sampler that can sample from different distributions? This is the class 
and existing methods:

MarsagliaTsangWangDiscreteSampler.createBinomialDistribution
MarsagliaTsangWangDiscreteSampler.createPoissonDistribution
MarsagliaTsangWangDiscreteSampler.createDiscreteDistribution

Leave it using 'create' or change to:

- 'new'
- 'for'
- 'of'

In this case 'of' still seems OK.

MarsagliaTsangWangDiscreteSampler.ofBinomialDistribution
MarsagliaTsangWangDiscreteSampler.ofPoissonDistribution
MarsagliaTsangWangDiscreteSampler.ofDiscreteDistribution

Perhaps I'll just make all the changes and the names can be assessed 
when the PR is ready. There may be others that I have missed that will 
turn up when changing the code.

Alex


>
> Gilles
>
>> Thoughts on this are welcomed.
>>
>> Alex
>>
>> [1] https://www.baeldung.com/java-constructors-vs-static-factory-methods
>>
>> On Fri, 19 Jul 2019 at 10:35, Alex Herbert <alex.d.herbert@gmail.com> wrote:
>>
>>> This interface has been added in v1.3:
>>>
>>> /**
>>>   * Applies to samplers that can share state between instances. Samplers
>>> can be created with a
>>>   * new source of randomness that sample from the same state.
>>>   *
>>>   * @param <R> Type of the sampler.
>>>   * @since 1.3
>>>   */
>>> public interface SharedStateSampler<R> {
>>>      /**
>>>       * Create a new instance of the sampler with the same underlying state
>>> using the given
>>>       * uniform random provider as the source of randomness.
>>>       *
>>>       * @param rng Generator of uniformly distributed random numbers.
>>>       * @return the sampler
>>>       */
>>>      R withUniformRandomProvider(UniformRandomProvider rng);
>>> }
>>>
>>> All of the DiscreteSampler and ContinuousSampler classes have been updated
>>> to implement SharedStateSampler.
>>>
>>> This is the equivalent of:
>>>
>>> /**
>>>   * Sampler that generates values of type {@code int} and can create new
>>> instances to sample
>>>   * from the same state with a given source of randomness.
>>>   *
>>>   * @since 1.3
>>>   */
>>> public interface SharedStateDiscreteSampler
>>>      extends DiscreteSampler,
>>> SharedStateSampler<SharedStateDiscreteSampler> {
>>>      // Marker interface
>>> }
>>>
>>>
>>> If this combined marker interface is added to the library it would
>>> simplify a lot of the code for samplers which have internal delegates to
>>> avoid casting. With this interface they can hold a
>>> SharedStateDiscreteSampler rather than a DiscreteSampler delegate and
>>> directly use it to create new delegates.
>>>
>>> For example PoissonSampler code which wraps a specific implementation:
>>>
>>> /**
>>>   * @param rng Generator of uniformly distributed random numbers.
>>>   * @param source Source to copy.
>>>   */
>>> private PoissonSampler(UniformRandomProvider rng,
>>>          PoissonSampler source) {
>>>      super(null);
>>>
>>>      if (source.poissonSamplerDelegate instanceof SmallMeanPoissonSampler) {
>>>          poissonSamplerDelegate =
>>>
>>> ((SmallMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng);
>>>      } else {
>>>          poissonSamplerDelegate =
>>>
>>> ((LargeMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng);
>>>      }
>>> }
>>>
>>> Becomes:
>>>
>>> private PoissonSampler(UniformRandomProvider rng,
>>>          PoissonSampler source) {
>>>      super(null);
>>>      poissonSamplerDelegate =
>>> source.poissonSamplerDelegate.withUniformRandomProvider(rng);
>>> }
>>>
>>>
>>> I propose to add:
>>>
>>> public interface SharedStateDiscreteSampler
>>>      extends DiscreteSampler,
>>> SharedStateSampler<SharedStateDiscreteSampler> {
>>>      // Marker interface
>>> }
>>>
>>> public interface SharedStateContinuousSampler
>>>      extends ContinuousSampler,
>>> SharedStateSampler<SharedStateContinuousSampler> {
>>>      // Marker interface
>>> }
>>>
>>>
>>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message