commons-dev mailing list archives

From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [Rng] New XoShiRo generators
Date Tue, 19 Mar 2019 10:35:35 GMT
> > [...]
> >> So limit the testing to just ints and document in the user guide that this is
> >> what we are testing.
> >
> > +1
>
> OK. That seems simplest.
>
> Given that all the stress tests will be rerun, shall I go ahead and reorder the existing files,
> the user guide .apt file and the GeneratorsList to be in the order of the RandomSource enum?

We could wait for the new results before updating the site.

>
>
> Big/Little Endian for Dieharder:
>
> I’ve spent some time looking at the source code for Dieharder. It reads binary file
> data using this (taken from libdieharder/rng_file_input_raw.c):
>
> unsigned int iret;
> // ...
> fread(&iret,sizeof(uint),1,state->fp);
>
> So it reads single unsigned integers using fread().
>
> Given that it is possible to run Dieharder using numbers from ASCII and binary input
> files, I set up a test. I created the files from an RNG with the same seed, writing the
> standard output of a DataOutputStream and the byte-reversed output produced with
> Integer.reverseBytes. Here’s what happens:
>
> > dieharder -g 201 -d 0 -f raw.bin.rev
>    diehard_birthdays|   0|       100|     100|0.89220858|  PASSED
> > dieharder -g 202 -d 0 -f raw.txt
>    diehard_birthdays|   0|       100|     100|0.89220858|  PASSED
>
> > dieharder -g 201 -d 0 -f raw.bin
>    diehard_birthdays|   0|       100|     100|0.30776452|  PASSED
> > dieharder -g 202 -d 0 -f raw.txt.rev
>    diehard_birthdays|   0|       100|     100|0.30776452|  PASSED
>
> > cat raw.bin | dieharder -g 200 -d 0
>    diehard_birthdays|   0|       100|     100|0.30776452|  PASSED
>
>
> Note the reversed byte sequence (.rev suffix) is required to get the same results from
> the binary (.bin) file as from the text (.txt) file.
>
> So, on this machine, Dieharder’s binary read uses the little-endian representation
> (fread() reads each unsigned int in the host’s native byte order), as was required
> for TestU01.
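
A minimal sketch of the two binary encodings compared above, assuming the files were produced roughly this way. The class and method names are hypothetical and not taken from the Commons RNG stress test code. `DataOutputStream.writeInt` always writes the high byte first (big endian), so passing the value through `Integer.reverseBytes` first yields the little-endian layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical demo class; not part of Commons RNG. */
final class ReverseBytesDemo {
    private ReverseBytesDemo() {}

    /** Write one int with DataOutputStream, optionally byte-reversed first. */
    static byte[] write(int value, boolean reverse) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // writeInt emits the most significant byte first.
            out.writeInt(reverse ? Integer.reverseBytes(value) : value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (boolean reverse : new boolean[] {false, true}) {
            StringBuilder sb = new StringBuilder(reverse ? "reversed:" : "standard:");
            for (byte b : write(0x01020304, reverse)) {
                sb.append(String.format(" %02x", b));
            }
            System.out.println(sb);
        }
    }
}
```

On a little-endian host, a native read of the reversed bytes returns the original int value, which is consistent with the `.bin.rev` file matching the text file in the results above.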
>
> I had modified the stdin2testu01.c bridge to detect if the system was little endian and
> then correct the input data by reversing the bytes. It may be a better idea to write a small
> C program to detect the endianness of the system for reference, and then update the stress
> test benchmark to have an argument for little or big endian output when piping the int data
> to the command line program.
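
The thread proposes a C program for the check. As a sketch of the same check from the Java side, `java.nio.ByteOrder.nativeOrder()` reports the byte order of the underlying platform. The class below is hypothetical and not part of Commons RNG; the second method is only a cross-check that stores the int 1 in native order and looks at which byte comes first.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper; not part of Commons RNG. */
final class NativeOrderCheck {
    private NativeOrderCheck() {}

    /** True if the platform reports little-endian native byte order. */
    static boolean isLittleEndian() {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;
    }

    /** Cross-check: in little-endian order the low byte of 1 is stored first. */
    static boolean lowByteStoredFirst() {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES).order(ByteOrder.nativeOrder());
        buffer.putInt(1);
        return buffer.get(0) == 1;
    }

    public static void main(String[] args) {
        System.out.println(isLittleEndian() ? "little endian" : "big endian");
    }
}
```

This only describes the machine running the JVM, so it is useful when the test program reading the piped data runs on the same host.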
>
> I think it is important to get the endianness of the data correct. Dieharder, at least,
> runs tests using tuples of bits from the data, which can span multiple bytes. For example,
> the sts_serial test (-d 102) uses overlapping n-tuples of bits with n from 1 to 16. Other
> tests using non-overlapping tuples, such as rgb_bitdist (-d 200), use n from 1 to 12.
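
To illustrate why byte order matters for bit tuples, here is a small sketch that reads an n-bit tuple from a byte array. It is not Dieharder's implementation: the class name, the method, and the choice to read the most significant bit of each byte first are all assumptions made for the example. The point is only that a tuple spanning a byte boundary picks up different bits when the bytes of each int are swapped.

```java
/** Hypothetical illustration; this is not how Dieharder extracts bits. */
final class BitTuples {
    private BitTuples() {}

    /**
     * Read n bits (1 to 32) starting at the given bit offset.
     * Assumption: bits are taken most significant bit first within each byte.
     */
    static int tuple(byte[] data, int bitOffset, int n) {
        int result = 0;
        for (int i = 0; i < n; i++) {
            int bit = bitOffset + i;
            int value = data[bit >>> 3] & 0xff; // byte containing this bit
            int shift = 7 - (bit & 7);          // position within the byte
            result = (result << 1) | ((value >>> shift) & 1);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] bigEndian = {1, 2, 3, 4};    // 0x01020304, high byte first
        byte[] littleEndian = {4, 3, 2, 1}; // same int, low byte first
        // A 16-bit tuple starting 8 bits in spans the second and third bytes.
        System.out.printf("BE: %04x%n", tuple(bigEndian, 8, 16));
        System.out.printf("LE: %04x%n", tuple(littleEndian, 8, 16));
    }
}
```

With the same int value, the two layouts give different 16-bit tuples at the same offset, so a test reading across byte boundaries effectively sees a different sequence.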
>
> Reversing the bytes in the Java code is the easiest option.

+1
[With an option flag for selecting whether the output should be BE or LE.]
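
A minimal sketch of what such a flag could look like, assuming the benchmark converts the generated ints to bytes before piping them to the test program. The class name, method, and the "LE" argument are hypothetical; only `java.nio.ByteBuffer` and `java.nio.ByteOrder` are real API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper; not part of Commons RNG. */
final class IntBytes {
    private IntBytes() {}

    /** Encode each int as 4 bytes in the requested byte order. */
    static byte[] toBytes(int[] values, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES).order(order);
        for (int v : values) {
            buffer.putInt(v);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        // Hypothetical flag: "LE" selects little endian, anything else big endian.
        ByteOrder order = args.length > 0 && "LE".equalsIgnoreCase(args[0])
            ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        StringBuilder sb = new StringBuilder(order.toString()).append(':');
        for (byte b : toBytes(new int[] {0x01020304}, order)) {
            sb.append(String.format(" %02x", b));
        }
        System.out.println(sb);
    }
}
```

Keeping the order as an explicit argument, rather than detecting it, leaves the choice visible in the command used for each stress test run.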

Best,
Gilles

> Others are:
>
> - Write the binary data to a file and then run Dieharder on that. Dieharder will end up
>   looping over the file, though, and repeating the sequence unless the binary file is huge.
> - Call Dieharder from a bridge program using the libdieharder API. I’ve not checked
>   if there is an API method call for Dieharder to run everything.
>
> Alex


