commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [rng] Split and Jump functions
Date Wed, 01 May 2019 15:05:42 GMT
Hi.

Le mar. 30 avr. 2019 à 17:08, Alex Herbert <alex.d.herbert@gmail.com> a écrit :
>
> On 29/04/2019 22:14, Gilles Sadowski wrote:
> > Hello.
> >
> > Le lun. 29 avr. 2019 à 19:09, Alex Herbert <alex.d.herbert@gmail.com> a écrit
:
> >> On 28/04/2019 19:11, Gilles Sadowski wrote:
> >>> Le dim. 28 avr. 2019 à 17:02, Alex Herbert <alex.d.herbert@gmail.com>
a écrit :
> >>>>
> >>>>> On 28 Apr 2019, at 00:59, Bernd Eckenfels <ecki@zusammenkunft.net>
wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> Just a question, I am unclear on the terminology, is „jump“
(did I miss the discussion leading toot?) something invented here? It sounds to me like this
is a generator where the state can be cloned and it is „seekable“. It probably makes sense
to have those two dimensions separated anyway.
> >>>> Hi Bernd, thanks for the input.
> >>>>
> >>>> This thread started with the definition:
> >>>> Jump:
> >>>>
> >>>> To create a new instance of the generator that is deterministically
based on the state of the current instance but advanced a set number of iterations.
> >>>>
> >>>>
> >>>> However it is not required to create a new instance at the same time
as jumping. You are correct in that this is two functionalities:
> >>>>
> >>>> 1. Jump forward in the sequence
> >>>> 2. Copy
> >>>>
> >>>> However the two are coupled. Having jump on its own is useless (why
move forward in the sequence without using it?). So a copy needs to be created somewhere before/after
the jump.
> >>>>
> >>>> The idea of a jump is to create a series of the generator at different
points in the state. The generators can be used for parallel computations and will be ensured
to not overlap in their output sequence for number of outputs skipped by the jump length.
> >>>>
> >>>> FYI. The generators that support this have jump sizes of 2^64, 96, 128,
192, 256 and 512. So this is a lot of output sequence to jump.
> >>>>
> >>>> Copy on its own works but for what purpose? If you want a second generator
at the moment you just create a new one (with a different seed). Duplicate copies of generators
is prone to potential pitfalls where simulations are not as random as you intend. For a special
use case where you wish to run multiple simulations with the same generator you can use the
Restorable interface to save the state of one and re-create it in other instances.
> >>>>
> >>>> The current thread came to the choice of:
> >>>>
> >>>>>>> So the options are (in all cases returning the copy):
> >>>>>>>
> >>>>>>> 1. createAndJumpCopy
> >>>>>>> 2. copyAndJumpParent
> >>>>>>> 3. jumpParentAndCopy
> >>>>>>> 4. jump and copy separately
> >>>> Jump and copy separately was ruled out to discourage misuse of copy.
> >>>>
> >>>> The current suggestion is 1. Create a copy and jump that ahead. The
current instance is not affected.
> >>>>
> >>>> I now consider this to be weaker for a variety of use cases than 2.
This copies the current state for use and then jumps the parent ahead. So this alters the
state of the parent generator.
> >>>>
> >>>> Note that all other methods of a generator alter its state. So having
jump alter its state is reasonable.
> >>>>
> >>>> The most flexible API is to separate jump and copy into two methods.
We can still support helper functions that take in a Jumpable generator and create a jump
series of generators for parallel work. Separating jump and copy allows the functionality
to be used in a larger number of ways than any other interface that attempts to combine jump
and copy.
> >>>>
> >>>> I am fine with having separate jump and copy. If so the copy method,
being part of the Jumpable interface, will be functionally coupled with the jump method and
should be described in Javadoc with the intended purpose to use it to copy the parent state
either before or after a jump into a child generator.
> >>>>
> >>>> As a precursor this API is very flexible:
> >>>>
> >>>> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>>>       /** Jump and return same instance. */
> >>>>       JumpableUniformRandomProvider jump();
> >>>>       /** Copy the instance. */
> >>>>       JumpableUniformRandomProvider copy();
> >>>> }
> >>>>
> >>>> Returning the same instance in jump() allows method chaining such as
either:
> >>>>
> >>>> rng.jump().copy();
> >>>> rng.copy().jump();
> >>>>
> >>>> This potential pitfall is that a user may do this:
> >>>>
> >>>> UniformRandomProvider rng1 = rng.copy().jump();
> >>>> UniformRandomProvider rng2 = rng.copy().jump();
> >>>>
> >>>> Where rng1 and 2 will be the same, 1 jump ahead of the parent state.
Or:
> >>>>
> >>>> UniformRandomProvider rng1 = rng.jump();
> >>>> UniformRandomProvider rng2 = rng.jump();
> >>>>
> >>>> Where rng, rng1 and rng2 are the same instance all 2 jumps ahead of
the start point.
> >>>>
> >>>> I think our purpose is to provide an API for the generators that can
jump and that is not too restrictive given the use cases we have so far thought up. There
may be other ideas of use cases that cannot be done with the coupled functionality of:
> >>>>
> >>>> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>>>       /** Copy the instance, then jump ahead. Return the copy of the
previous state. */
> >>>>       JumpableUniformRandomProvider jump();
> >>>> }
> >>>>
> >>>> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>>>       /** Copy the instance, then jump the copy ahead. Return the copy.
The current instance is not affected. */
> >>>>       JumpableUniformRandomProvider jump();
> >>>> }
> >>>>
> >>>> So the split functions without allowing method chaining:
> >>>>
> >>>> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>>>       /** Jump the current instance ahead. */
> >>>>       void jump();
> >>>>       /** Copy the instance. This is intended to be used either before
or after a call to jump()
> >>>>        * to create a series of generators. */
> >>>>       JumpableUniformRandomProvider copy();
> >>>> }
> >>> As you indicated above, there is no advantage in having separate
> >>> "jump()" and "copy()",
> >>> as counter-intuitive it may look at first sight.
> >> So is the favoured approach now to make the jump method return a copy of
> >> the current state and then increment the state of the current instance
> >> with a jump, as in:
> >>
> >> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>       /** Copy the instance, then jump ahead. Return the copy of the previous
state. */
> >>       JumpableUniformRandomProvider jump();
> >> }
> >>
> >> This would appear to solve more use cases than returning a copy that is
> >> further ahead. For example the dynamic generation of a jump sequence of
> >> an unknown size:
> >>
> >> List<UniformRandomProvider> rngs = ...;
> >> JumpableUniformRandomProvider rng;
> >> for (Data data : Iterable<Data>) {
> >>       rngs.add(rng.jump());
> >> }
> >> // Do something with all the Data and rngs
> > I confirm agreement. ;-)
> OK. I'll modify the Jira tickets to match.
> >
> >> I note that if this is the functionality then the jump() method can be
> >> written to return a UniformRandomProvider:
> >>
> >> JumpableUniformRandomProvider extends UniformRandomProvider {
> >>       /** Copy the instance, then jump ahead. Return the copy of the previous
state. */
> >>       UniformRandomProvider jump();
> >> }
> >>
> >> It would not have to be enforced to remove jump functionality from the
> >> copy but should make it clear that the returned object is not to be
> >> jumped again.
> > Yes, but a cast could work around the artificial limitation of
> > the return type, while "restrict" prevents misuse.
>
> My point was which of the following do we want the interface to support:
>
> 1. Return JumpableUniformRandomProvider (i.e. a direct copy)
>
> 2. Return UniformRandomProvider (this could be a copy that could be cast
> to Jumpable)
>
> 3. Really return a UniformRandomProvider by enforcing the removal of the
> jump function
>
> For option 3 we will have to provide a helper method somewhere to do
> this. Implementations should then use this to wrap a return copy.
>
> However removal of the jump function by restriction to just
> UniformRandomProvider will also remove support for
> RestorableUniformRandomProvider, and any other interface the
> implementation is providing.
>
> I am thinking that option 2 is OK. It would not restrict the returned
> copy but subtly suggests that the copy is not to be used for jumping.

OK. [Assuming that "jumping" main (only?) case is for a user to speed
up computations, so that it is his responsiblity to not misuse (e.g. "jump"
again) the returned instance.]

> We can enforce actual restriction in the helper method to be added to
> RandomSource to create a jump series of RNGs from a Jumpable one. But
> even then it should be optional (which you previously suggested) since
> downstream consumers of the RNG may want Restore (or Jump) functionality.

No, when getting a "restricted" instance, downstream consumers should
have no way to override it, especially in cases where they could mess
other parts of the code.

> Consider that this could be solved by a smart restriction method that
> takes an EnumSet of the interfaces that should be exposed. We then
> support combinations of the various interfaces. So far we have:
>
> UniformRandomProvider (always)
>
> RestorableUniformRandomProvider (2 methods)
>
> JumpableUniformRandomProvider (1 method)
>
> LongJumpableUniformRandomProvider (1 method)
>
> The total number of combinations of the 3 optional interfaces is 2^3 ==
> 8 (the 3rd row from Pascal's triangle). Although since you can't be
> LongJumpable and not Jumpable this drops to 6. Anyway it will get out of
> hand if more interfaces are added. The point being choosing what to
> restrict to has many options.
>
> My suggestion would be to not enforce restriction in the interface for
> jumps but make the return value the next level up the hierarchy:
>
> longJump returns JumpableUniformRandomProvider
>
> jump returns UniformRandomProvider

Yes, that seems interesting from a usage POV.

> Then in the helper method to create a jump series the returned array can
> be the same type (UniformRandomProvider or
> JumpableUniformRandomProvider) and the method has an option to enforce
> this interface by wrapping.

But IIUC, the now preferred implementation make the convenience
method not necessary (?).

Best,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message