commons-dev mailing list archives

From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [Rng] New XoShiRo generators
Date Tue, 19 Mar 2019 15:24:08 GMT
Hello.

> >>>>> [...]
> >>>> Given all the stress tests will be rerun shall I go ahead and reorder
> >>>> the existing files, user guide .apt file and the GeneratorsList to be in
> >>>> the order of the RandomSource enum?
> >>> We could wait for the new results before updating the site.
> >> I was going to rearrange it all and test all the links in the local site are
> >> all ok. I have this scripted but have not yet run it.
> > Are you going to upload this script to the repository?
>
> I wasn't going to. I've put it into my branch here:
>
> https://github.com/aherbert/commons-rng/blob/userguide-rename/rename.pl

IIUC, this one is only for updating the current site (?).
I was thinking of a utility that takes the output (of several runs) of the test
suites and generates the table in "apt" format (with correct links).

> It generates two files that should do all the rearrangement. It's a work
> in progress. I've just tried it out and it seems to work, although I've
> not looked at the generated site.
>
> The 'git mv' command when viewed using 'git log -M --summary' shows the
> renames, e.g.
>
> commit c8b8903c00ab6d2c1403667048f27d9cbad4de46
> Author: aherbert <aherbert@apache.org>
> Date:   Tue Mar 19 12:14:13 2019 +0000
>
>      Updated stress test results files
>
>   rename src/site/resources/txt/userguide/stress/dh/run_1/{dh_K => dh_10} (100%)
>   rename src/site/resources/txt/userguide/stress/dh/run_1/{dh_L => dh_11} (100%)
>   rename src/site/resources/txt/userguide/stress/dh/run_1/{dh_M => dh_12} (100%)
>   rename src/site/resources/txt/userguide/stress/dh/run_1/{dh_J => dh_13} (100%)
>   rename src/site/resources/txt/userguide/stress/dh/run_1/{dh_C => dh_2} (100%)
>
>
> However it is probably easiest to leave it as is and have the source
> repo results files out of sync with the GeneratorsList until the next
> benchmark results are done.
>

I think so.

> >
> >> When new results are ready they can be written over the existing ones. Either
> >> way I am fine. So let’s leave it until new results have been done and then
> >> check the site.
> >>
> >> I will update the GeneratorsList to be autogenerated from the RandomSource enum.
> > Thanks.
> > Let me know when everything is in place, and I'll try and start a stress test
> > run on my side.
>
> OK.
>
> I am currently rerunning the dieharder test for the XorShift1024Star
> composites since that requires a little endian format on my machine. So
> far there are not as many failures when the byte order is reversed.
>
> Once that is done I think we can wrap this up by:
>
> - updating the stress test to support little/big endian format as input
> for the test suite
>
> - updating the stress test GeneratorsList to match the RandomSource enum
> order
>
> - merging the modified XorShift1024StarPhi generator
>
> - deprecating the XOR_SHIFT_1024_S enum in favour of XOR_SHIFT_1024_S_PHI
>
> - merging the new XoShiRo generators
>
> Then it should be ready for a new stress test benchmark.
> >>>>
> >>>> Big/Little Endian for Dieharder:
> >>>>
> >>>> [...]
> >>>>
> >>>> Reversing the bytes in the Java code is the easiest option.
> >>> +1
> >>> [With an option flag for selecting whether the output should be BE or LE.]
> >>>
> >> OK. I will consolidate all this and update the stress_test.md instructions to
> >> make it clear that endianness needs to be considered.
> >>
> >> Should I add the raw data dumper to the source base? This runs a named
> >> RandomSource for a given number of iterations with a provided seed and
> >> outputs 4 files: Dieharder text format and raw binary, with standard order
> >> and byte reversed. It may be useful if debugging the output of RNGs ever
> >> needs to be done again.
> > Sure.  Can this be also added as an option to the "RandomStressTester"
> > class?  E.g. with a flag like
> >    --dump file_prefix,sequence_length
> > where
> >    "file_prefix" is the basename of the output files, and
> >    "sequence_length" is the number of ints to generate.
>
> The RandomStressTester uses a list of generators. I built the
> RawDataDumper to run using a maven profile where it works for a single
> named RandomSource. Arguments are:
>
> RandomSource name, long seed, sequence length, file prefix.
>
> So all the functionality is there. If the file is included in the shaded
> jar you should be able to do:
>
>  > java -cp examples-stress.jar org.apache.commons.rng.examples.stress.RawDataDumper SPLIT_MIX_64 123L 1000 splitmix.out
>
> Instead of running in a maven profile it could be built into a shaded
> package to allow:
>
>  > java -jar raw-data-dumper.jar SPLIT_MIX_64 123L 1000 splitmix.out

I like that.
One build should generate all applications (dumper and stress test launcher).
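
For the raw binary part (standard order and byte reversed), here is a
minimal sketch of what the dumper has to do. It is only an illustration:
the class and method names are made up, it is not the RawDataDumper from
the branch, and it uses a fixed int array instead of a RandomSource so
that it has no dependencies. It also shows that writing byte-reversed ints
in big-endian order gives the same bytes as writing the original ints in
little-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Commons RNG.
class RawBytesSketch {
    /** Packs the ints into a byte array using the requested byte order. */
    static byte[] toBytes(int[] values, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES).order(order);
        for (int value : values) {
            buffer.putInt(value);
        }
        return buffer.array();
    }

    /** Reverses the bytes of each int, then writes the result big-endian. */
    static byte[] toReversedBytes(int[] values) {
        int[] reversed = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            reversed[i] = Integer.reverseBytes(values[i]);
        }
        return toBytes(reversed, ByteOrder.BIG_ENDIAN);
    }

    public static void main(String[] args) {
        // Stand-in for the output of a generator.
        int[] sample = {0x01020304, 0xcafebabe};
        byte[] littleEndian = toBytes(sample, ByteOrder.LITTLE_ENDIAN);
        byte[] reversed = toReversedBytes(sample);
        // The two byte sequences are identical.
        System.out.println(java.util.Arrays.equals(littleEndian, reversed));
    }
}
```

So a single "byte reversed" flag in the Java code should be enough to
cover both cases, whatever the byte order of the machine that runs the
test suite.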

> I think the functionality is better as two programs to avoid doing too
> much in the RandomStressTester. Also I do not see the need to dump
> output from a list of 15+ generators.

+1

> I can add flag arguments to be used to specify which file to write:
>
> .dh (text output using the dieharder format (uses unsigned int))
>
> .txt (text output)
>
> .raw (raw binary output)
>
> .bin (2s complement binary output)
>
> .hex (text output using hex)
>
>
> And options for:
>
> - int output
>
> - long output
>
> - byte reversed
>
> - bit reversed
>
> You then have a utility for dumping output of any random source to file
> in a variety of formats.
>
> Although long output is not needed for the test suites it is useful for
> native long generators.
>
> WDYT?

Looks good!
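
To make sure we mean the same thing by the text formats and the reversal
options, here is a rough sketch using only JDK methods. The class and
method names are hypothetical, and the header lines that dieharder expects
at the top of a text input file are left out; only the per-value
formatting is shown.

```java
// Hypothetical helper, not part of Commons RNG.
class OutputFormatsSketch {
    /** Dieharder-style value: the 32 bits read as an unsigned decimal. */
    static String toUnsignedDecimal(int value) {
        return Integer.toUnsignedString(value);
    }

    /** Plain text value: the 32 bits read as a signed (2s complement) int. */
    static String toSignedDecimal(int value) {
        return Integer.toString(value);
    }

    /** Fixed-width hex, zero padded to 8 digits. */
    static String toHex(int value) {
        return String.format("%08x", value);
    }

    /** Unsigned decimal for native long generators. */
    static String toUnsignedDecimal(long value) {
        return Long.toUnsignedString(value);
    }

    public static void main(String[] args) {
        int value = 0x80000001;
        System.out.println(toSignedDecimal(value));   // -2147483647
        System.out.println(toUnsignedDecimal(value)); // 2147483649
        System.out.println(toHex(value));             // 80000001

        int other = 0x12345678;
        // Byte reversed: the 4 bytes swap places, bits within a byte keep their order.
        System.out.println(toHex(Integer.reverseBytes(other))); // 78563412
        // Bit reversed: all 32 bits are mirrored.
        System.out.println(toHex(Integer.reverse(other)));      // 1e6a2c48
    }
}
```

The reversal is applied to the value before formatting, so each option
combines with any of the output formats.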

Thanks,
Gilles


