commons-dev mailing list archives

From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [Rng] New XoShiRo generators
Date Wed, 06 Mar 2019 22:57:48 GMT
On Wed, 6 Mar 2019 at 23:07, Alex Herbert <alex.d.herbert@gmail.com> wrote:
> >
> > On 6 Mar 2019, at 21:42, Alex Herbert <alex.d.herbert@gmail.com> wrote:
> >
> >
> >
> >> On 6 Mar 2019, at 21:24, Gilles Sadowski <gilleseran@gmail.com> wrote:
> >>
> >> Hello.
> >>
> >> On Wed, 6 Mar 2019 at 21:49, Alex Herbert <alex.d.herbert@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>>> On 6 Mar 2019, at 17:11, Gilles Sadowski <gilleseran@gmail.com> wrote:
> >>>>
> >>>> Do the two variants produce uncorrelated sequences?
> >>>
> >>> I will test this when I branch a new PR for just this code.
> >>
> >> IMHO, it's strange that there would be 2 sources of randomness in a single
> >> implementation.
> >> Concretely: If one needs a fast "int" provider, and a fast "long" provider,
> >> I'd consider the simpler solution of using 2 different providers.
> >
> > I think this has crossed wires somewhere. I was talking about the variant of the
> > XorShift1024Star algorithm and whether XorShift1024Star should be deprecated in
> > favour of XorShift1024StarPhi.

In the above quote, I was referring to the override of "nextInt()" in
"SplitMix64".

> >
> > The variant of the SplitMix64 algorithm for producing ints was tested in a benchmark
> > that I am prepared to throw away. The results are in the Jira ticket. The way the
> > SplittableRandom creates an int is slightly slower than the method used in [RNG]
> > SplitMix64 which divides the long in half.

I'm not sure I understand correctly, but I'd not aim at copying how
"SplittableRandom" produces each type of sequence if it is at odds
with the component's design: one source (either "int" or "long"), and
one way to generate each of the other types.
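
To make the "one source, one derivation" idea concrete, here is a minimal
sketch. It is not the actual [RNG] code: the class name "HalvedLongSource"
is hypothetical, and returning the high half before the low half is an
assumption made only for illustration.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch (not the Commons RNG implementation): a single
 * 64-bit source from which 32-bit values are derived in exactly one way,
 * by dividing each long in half.
 */
final class HalvedLongSource {
    private final LongSupplier source; // the one source of randomness
    private long cached;               // most recent 64-bit output
    private boolean lowHalfPending;    // true if the low 32 bits are unused

    HalvedLongSource(LongSupplier source) {
        this.source = source;
    }

    long nextLong() {
        return source.getAsLong();
    }

    int nextInt() {
        if (lowHalfPending) {
            lowHalfPending = false;
            return (int) cached;          // low 32 bits (assumed order)
        }
        cached = source.getAsLong();
        lowHalfPending = true;
        return (int) (cached >>> 32);     // high 32 bits
    }
}
```

With such a layout, a second mixing function for "int" output is not
needed: every type is derived from the same 64-bit stream.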

> This ticket can be closed as done and I’ll add a comment that no speed improvement
> was found.
> >
> > I agree that this variant algorithm should have been in a new provider.

So, we agree.

> It would produce a different output of bytes since the bit shift in the second step is
> different. But I’m not going to add this algorithm so it does not matter.

OK.

> >
> > However I will test if XorShift1024Star and XorShift1024StarPhi are correlated just
> > for completeness.
> >
>
> Did a test of 100 repeats of a correlation of 50 longs from the XorShift1024Star and
> XorShift1024StarPhi, new seed each time:
>
> SummaryStatistics:
> n: 100
> min: -0.30893547071559685
> max: 0.37616626218398586
> sum: 3.300079237520435
> mean: 0.033000792375204355
> geometric mean: NaN
> variance: 0.022258533475114764
> population variance: 0.022035948140363616
> second moment: 2.2035948140363617
> sum of squares: 2.312500043775496
> standard deviation: 0.14919294043323486
> sum of logs: NaN
>
> Note that the algorithm is the same except the final step when the multiplier is used
> to scale the final output long:
>
>    return state[index] * multiplier;
>
> So if it was outputting a double the correlation would be 1. But it is a long generator
> so the long arithmetic wraps to negative on large multiplications. The result is that the
> mean correlation is close to 0.
>
> A single repeat using 1,000,000 numbers has a correlation of 0.002.
>
> Am I missing something here with this type of test?

I'm afraid I don't follow: If the state is the same then I'd assume that
the two generators are the same (i.e. totally correlated).
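
Both views may be compatible: the two outputs are deterministic functions
of the same state, yet a linear (Pearson) coefficient computed on the
wrapped 64-bit products can still come out near zero. The following is a
minimal sketch of that effect, not the test quoted above. Assumptions:
"SplittableRandom" is only a stand-in for the shared "state[index]" value,
the class name is hypothetical, and the two multipliers are, to my
knowledge, the ones used by xorshift1024* and xorshift1024*phi.

```java
import java.util.SplittableRandom;

/** Hypothetical sketch: same state, two multipliers, Pearson correlation. */
final class MultiplierCorrelation {
    // Assumed multipliers of the two variants (not taken from this thread).
    static final long MULT_STAR = 1181783497276652981L;
    static final long MULT_PHI = 0x9e3779b97f4a7c13L;

    /** Plain two-pass Pearson correlation coefficient. */
    static double pearson(double[] x, double[] y) {
        final int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i] / n;
            meanY += y[i] / n;
        }
        double sxy = 0;
        double sxx = 0;
        double syy = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            final double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Correlates s * MULT_STAR with s * MULT_PHI for the same values s. */
    static double correlation(long seed, int n) {
        final SplittableRandom state = new SplittableRandom(seed); // stand-in
        final double[] a = new double[n];
        final double[] b = new double[n];
        for (int i = 0; i < n; i++) {
            final long s = state.nextLong();
            a[i] = s * MULT_STAR; // long multiplication wraps modulo 2^64
            b[i] = s * MULT_PHI;
        }
        return pearson(a, b);
    }

    public static void main(String[] args) {
        System.out.println(correlation(42L, 1_000_000));
    }
}
```

A value near zero here says only that there is no linear relationship
between the two wrapped outputs. It does not make the two generators
independent, since each output determines the other.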

Regards,
Gilles


