commons-dev mailing list archives

From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [Rng] New XoShiRo generators
Date Mon, 18 Mar 2019 10:20:56 GMT
On Sun, 17 Mar 2019 at 01:01, Alex Herbert <alex.d.herbert@gmail.com> wrote:
>
>
>
> > On 16 Mar 2019, at 23:10, Alex Herbert <alex.d.herbert@gmail.com> wrote:
> >
> >
> >
> >> On 16 Mar 2019, at 02:54, Gilles Sadowski <gilleseran@gmail.com> wrote:
> >>> This is read by dieharder, which reads directly from stdin. This worked to
> >>> collect all the generated bits, and the serial and xor composites failed
> >>> the test suite.
> >>>
> >>> It is also read by the stdin2testu01.c program to pass to TestU01.
> >>>
> >>> What is happening is that stdin2testu01.c is reading 64 bits using an
> >>> unsigned long.
> >>
> >> I don't remember why I wrote that, but as you pointed out it now looks
> >> like a plain bug.
> >
> > It may be more complicated again...
> >
> > I’ve had a play around with the data being pushed through to the TestU01
> > library using the C bridge. I wanted to check that the int value generated
> > by the RNG is passed through to the C program, so I wrote a simple
> > BridgeTester class to do this. It writes all the int values to a data file
> > (for reference), then passes them to the C executable with the same method
> > as the RandomStressTester. I then modified the stdin2testu01.c program to
> > have an extra hidden debug mode where all the data is just written to stdout.
> >
> > I found the data file written from Java did not match the data that the C
> > program had. A bit more digging found that the problem was that Java uses a
> > big-endian representation and the C program is little-endian. This is true
> > on my Linux and Mac OS X platforms. So the raw bytes read from stdin are in
> > the wrong order.
> >
> > When I updated the program to self-detect endianness and swap the byte
> > order of each set of 4 bytes from stdin, the data in the C program matched
> > the original.
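
The mismatch described above can be reproduced in a few lines. This is only a minimal sketch, not the bridge code: the class and method names are made up for illustration, and it assumes the Java side emits each int most significant byte first (the order DataOutputStream.writeInt uses, and the default order of ByteBuffer) while the reader interprets the same 4 bytes as little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of commons-rng.
class ByteOrderSketch {
    /** Encodes one int most significant byte first (big-endian). */
    static byte[] writeBigEndian(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    /** Decodes 4 bytes as a little-endian int, as a native read on x86 would. */
    static int readLittleEndian(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        int original = 0x01020304;
        int seen = readLittleEndian(writeBigEndian(original));
        System.out.printf("written: %08x%n", original);
        System.out.printf("read   : %08x%n", seen);
        // Swapping the 4 bytes on the reading side restores the original value.
        System.out.printf("swapped: %08x%n", Integer.reverseBytes(seen));
    }
}
```

Swapping each group of 4 bytes on the reading side, as the modified C program does, is the same operation as Integer.reverseBytes here.
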
> >
> > Since it was non-destructive to the module I added all this to master. You
> > can see this working by rebuilding the C bridge and running the new profile
> > to test it:
> >
> > > cd commons-rng-examples/examples-stress
> > > gcc src/main/c/stdin2testu01.c -o stdin2testu01 -ltestu01 -ltestu01probdist -ltestu01mylib -lm
> > > mvn test -P bridge
> >
> > You should see two files:
> >
> > target/bridge.data
> > target/bridge.out
> >
> > These should have the same contents. The .data file is written by the Java
> > program, and the .out file is the stdout captured from the C program with
> > its view of the data.
> >
> > This should fix running TestU01.
> >
> > BUT I’ve not had time to determine how Dieharder is reading stdin. Given it
> > is a C library, it may be reading it as little-endian as well. I’ll look
> > into that next.
> >
> > Composite update:
> >
> > For some reason all my BigCrush simulations crashed. It could be a RAM
> > issue. The runs did take longer than expected but I did not monitor memory
> > usage. I’ve started them again but using only the serial composite. I think
> > the xor one is really broken.
> >
> > FYI. Using the new bridge code, 3 runs of SmallCrush find [6, 6, 6] / 15
> > failed tests for the serial composite and [9, 9, 10] / 15 for the xor
> > composite.
> >
> > I’m expecting BigCrush to fail a lot. I’m now more interested in seeing if
> > it will complete.
> >
> > Alex
> >
>
>
> PS. Thinking about the endianness, it might not matter. Ideally the test suite
> will detect non-random bits whether they sit in the least or the most
> significant byte of the 32 bits, i.e. it should always find a problem. I am
> not clear if this is the case. I have read that some generators can pass
> BigCrush but fail if the bits are reversed (not the bytes but the bits). I’m
> happy to think that endianness is not an issue.
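
The distinction between reversing bytes and reversing bits can be made concrete with a small sketch (the class name is hypothetical; Integer.reverseBytes and Integer.reverse are standard JDK methods). A byte swap moves the lowest byte to the top but keeps the bit order inside each byte, so the lowest bit lands in bit 24, not bit 31.

```java
// Hypothetical illustration class; not part of commons-rng.
class ReversalSketch {
    /** Byte order reversed: what an endianness mismatch does to a 32-bit value. */
    static int byteReversed(int value) {
        return Integer.reverseBytes(value);
    }

    /** Bit order reversed: bit 0 becomes bit 31, bit 1 becomes bit 30, and so on. */
    static int bitReversed(int value) {
        return Integer.reverse(value);
    }

    public static void main(String[] args) {
        int lowBitOnly = 0x00000001;
        System.out.printf("bytes reversed: %08x%n", byteReversed(lowBitOnly));
        System.out.printf("bits reversed : %08x%n", bitReversed(lowBitOnly));
    }
}
```

So a test that is sensitive mainly to the most significant bits would see a different bit after an endianness mix-up than after a full bit reversal.
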
>
> It was a good exercise in debugging if the bridge was working though.
>
> One actual issue is that we are testing long providers by using each long to
> create 2 int values. Should we test using a series of the upper 32 bits and
> then a series of the lower 32 bits?

Is that useful since the test now sees the integers as they are produced (i.e. 2
values per long)?
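
To make the two options concrete, here is a rough sketch of what each stream would look like. The class and method names are hypothetical, and the upper-then-lower order in the interleaved case is an assumption for illustration, not a statement about what the provider code does.

```java
// Hypothetical illustration class; not part of commons-rng.
class LongSplitSketch {
    /** Upper 32 bits of a long. */
    static int upper(long value) {
        return (int) (value >>> 32);
    }

    /** Lower 32 bits of a long. */
    static int lower(long value) {
        return (int) value;
    }

    /** Two ints per long, interleaved (upper first: an assumed order). */
    static int[] interleaved(long[] values) {
        int[] out = new int[values.length * 2];
        for (int i = 0; i < values.length; i++) {
            out[2 * i] = upper(values[i]);
            out[2 * i + 1] = lower(values[i]);
        }
        return out;
    }

    /** One int per long, taken only from the upper halves. */
    static int[] upperOnly(long[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = upper(values[i]);
        }
        return out;
    }

    /** One int per long, taken only from the lower halves. */
    static int[] lowerOnly(long[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = lower(values[i]);
        }
        return out;
    }
}
```

The interleaved stream uses every generated bit, while the upper-only and lower-only streams each discard half of the output and would need twice as many longs to feed the same test.
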

Gilles

> I may set an unused workstation on this task to see what happens.
>
> Alex


