commons-dev mailing list archives

From Alex Herbert <alex.d.herb...@gmail.com>
Subject [rng] stress test results
Date Thu, 16 May 2019 10:06:42 GMT
I have run the stress tests using the new application, which has two major changes over the
previous one:

1. It detects the platform byte order and sends the bits in the correct order to be read by
a C application.
2. The bridge to TestU01 has been updated to use all the input int values; previously it was
using every other int value.

So we can expect differences from both test suites, Dieharder and TestU01 BigCrush.
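
The byte-order point in item 1 can be illustrated with a minimal Java sketch. This is not the
stress test application's code; the class and method names are hypothetical. It encodes an int
in one byte order and shows that a reader assuming the other order sees the byte-reversed value:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not the stress test application's code.
final class ByteOrderSketch {

    /** Encodes one int as 4 bytes in the given byte order. */
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    /** Decodes 4 bytes as one int, assuming the given byte order. */
    static int fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        // The order a C program on this machine uses when it reads 4 bytes as a 32-bit integer.
        ByteOrder nativeOrder = ByteOrder.nativeOrder();
        ByteOrder otherOrder = nativeOrder == ByteOrder.BIG_ENDIAN
            ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;

        // Writer and reader agree: the value survives unchanged.
        int matched = fromBytes(toBytes(value, nativeOrder), nativeOrder);
        // Writer and reader disagree: the reader sees the bytes reversed.
        int mismatched = fromBytes(toBytes(value, otherOrder), nativeOrder);

        System.out.println("native order: " + nativeOrder);
        System.out.println("matched:      " + Integer.toHexString(matched));
        System.out.println("mismatched:   " + Integer.toHexString(mismatched));
        System.out.println("mismatched == reverseBytes(value): "
            + (mismatched == Integer.reverseBytes(value)));
    }
}
```

A writer that always uses one fixed order therefore feeds byte-reversed values to the C test
suites on any platform whose native order differs.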

For reference here are the old results (from the user guide, reordered to the RandomSource
enum order):

RNG                     Dieharder       TestU01 (BigCrush)
JDK                     11, 12, 13      74, 72, 75
WELL_512_A              0, 0, 0         7, 6, 6
WELL_1024_A             0, 0, 0         4, 4, 5
WELL_19937_A            0, 0, 0         3, 2, 3
WELL_19937_C            0, 1, 0         2, 2, 3
WELL_44497_A            0, 0, 0         2, 3, 3
WELL_44497_B            0, 0, 0         2, 2, 2
MT                      0, 1, 0         3, 2, 2
ISAAC                   0, 0, 1         0, 1, 0
SPLIT_MIX_64            0, 0, 0         2, 0, 0
XOR_SHIFT_1024_S        0, 0, 0         2, 0, 0
TWO_CMRES               1, 1, 1         0, 0, 1
MT_64                   0, 0, 1         3, 2, 3
MWC_256                 0, 0, 0         0, 0, 0
KISS                    0, 0, 0         1, 2, 0

Here are the new results:

RNG                  	Dieharder	TestU01 (BigCrush)
JDK                  	4,4,4,4,4	74,72,74,73,74    
WELL_512_A           	0,0,0,0,0	7,6,6,6,6         
WELL_1024_A          	0,0,0,0,0	4,4,5,4,4         
WELL_19937_A         	0,1,0,0,1	3,3,2,2,2         
WELL_19937_C         	0,0,0,0,0	2,2,3,2,2         
WELL_44497_A         	0,0,0,0,0	2,2,2,2,3         
WELL_44497_B         	0,0,0,0,0	2,3,2,2,2         
MT                   	0,0,0,0,0	2,3,2,2,2         
ISAAC                	0,0,0,0,0	0,1,2,0,0         
SPLIT_MIX_64         	0,0,0,0,0	1,0,0,0,0         
XOR_SHIFT_1024_S     	0,0,0,0,0	0,0,0,0,0         
TWO_CMRES            	2,2,2,2,2	4,3,3,5,4         
MT_64                	0,0,0,0,0	2,3,2,2,2         
MWC_256              	0,1,0,0,0	0,0,0,2,0         
KISS                 	0,0,0,0,0	0,0,0,0,0         
XOR_SHIFT_1024_S_PHI 	0,0,0,0,0	0,0,0,0,0         
XO_RO_SHI_RO_64_S    	0,0,0,0,0	1,1,2,1,3         
XO_RO_SHI_RO_64_SS   	0,0,0,0,0	0,0,0,0,0         
XO_SHI_RO_128_PLUS   	0,0,1,0,0	1,2,2,1,1         
XO_SHI_RO_128_SS     	0,0,0,1,0	0,1,0,0,0         
XO_RO_SHI_RO_128_PLUS	0,0,0,0,0	0,1,0,0,0         
XO_RO_SHI_RO_128_SS  	0,0,0,0,0	1,0,1,0,0         
XO_SHI_RO_256_PLUS   	0,1,0,0,0	0,0,0,0,0         
XO_SHI_RO_256_SS     	0,0,0,0,0	0,1,0,2,1         
XO_SHI_RO_512_PLUS   	0,0,0,0,1	0,0,0,2,2         
XO_SHI_RO_512_SS     	0,0,0,0,0	0,1,0,1,0

(Note: All of the single fails except one under Dieharder are for the flawed diehard_sums
test. I include it here for direct comparison with old results. I would recommend we strip
this from the new results for the user guide.)

I ran them 3 times. Then, because the results differed from the old ones (mainly for the JDK
generator under Dieharder), I double-checked everything and ran another 2. The results are
still the same. The JDK generator does much better on Dieharder than previously. It
systematically fails:

diehard_opso:0
diehard_oqso:0
diehard_dna:0
dab_bytedistrib:0

The TWO_CMRES generator is now worse on Dieharder, as it systematically fails:

diehard_oqso:0
diehard_dna:0

The BigCrush results are similar for the JDK and all the others except TWO_CMRES, which now
fails a few more tests. It systematically fails:

1  SerialOver, r = 0
41  Permutation, t = 5
42  Permutation, t = 7
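
Reading "systematically fails" as "fails in every trial" (an assumption about the term, not a
definition taken from the test application), the systematic failures are the intersection of
the per-trial failure sets. A minimal sketch with hypothetical names and made-up trial data:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of the stress test application.
final class SystematicFailuresSketch {

    /** Returns the tests that appear in the failure set of every trial. */
    static Set<String> systematicFailures(List<Set<String>> failuresPerTrial) {
        if (failuresPerTrial.isEmpty()) {
            return new LinkedHashSet<>();
        }
        // Start from the first trial and keep only names seen in all later trials.
        Set<String> common = new LinkedHashSet<>(failuresPerTrial.get(0));
        for (Set<String> trial : failuresPerTrial.subList(1, failuresPerTrial.size())) {
            common.retainAll(trial);
        }
        return common;
    }

    private static Set<String> setOf(String... names) {
        return new LinkedHashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // Made-up failure sets for three trials; NOT the actual results.
        List<Set<String>> trials = Arrays.asList(
            setOf("diehard_oqso", "diehard_dna", "diehard_sums"),
            setOf("diehard_oqso", "diehard_dna"),
            setOf("diehard_dna", "diehard_oqso"));
        System.out.println(systematicFailures(trials));
    }
}
```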

To check the JDK results for Dieharder I ran it 5 times using the wrong platform byte order
(i.e. what the previous test application was doing).

Old application:                    11, 12, 13
New application, wrong byte order:  11, 16, 14, 14, 15

So this matches up: if the JDK output is byte-reversed, it is a poor generator.

A few sources I have read indicate that BigCrush favours the upper bits of a generator. A
test should therefore also run the generator output bit-reversed through the test application.
Here are the full forward and bit-reversed results, ignoring the diehard_sums test:

RNG                  	Bit-reversed	Dieharder     	TestU01 (BigCrush)
JDK                  	false       	4,4,4,4,4     	74,72,74,73,74    
JDK                  	true        	42,42,43,49,49	35,34,35,36,36    
WELL_512_A           	false       	0,0,0,0,0     	7,6,6,6,6         
WELL_512_A           	true        	0,0,1,0,0     	7,6,6,7,6         
WELL_1024_A          	false       	0,0,0,0,0     	4,4,5,4,4         
WELL_1024_A          	true        	0,0,0,0,0     	4,4,4,4,4         
WELL_19937_A         	false       	0,1,0,0,0     	3,3,2,2,2         
WELL_19937_A         	true        	0,0,0,0,0     	3,2,2,2,3         
WELL_19937_C         	false       	0,0,0,0,0     	2,2,3,2,2         
WELL_19937_C         	true        	0,0,0,0,0     	3,2,2,3,2         
WELL_44497_A         	false       	0,0,0,0,0     	2,2,2,2,3         
WELL_44497_A         	true        	0,0,0,0,0     	3,3,3,2,2         
WELL_44497_B         	false       	0,0,0,0,0     	2,3,2,2,2         
WELL_44497_B         	true        	0,0,0,0,0     	2,2,2,2,3         
MT                   	false       	0,0,0,0,0     	2,3,2,2,2         
MT                   	true        	0,0,0,0,0     	2,2,3,3,3         
ISAAC                	false       	0,0,0,0,0     	0,1,2,0,0         
ISAAC                	true        	0,0,0,0,0     	0,0,0,0,0         
SPLIT_MIX_64         	false       	0,0,0,0,0     	1,0,0,0,0         
SPLIT_MIX_64         	true        	0,0,0,0,0     	0,1,0,0,0         
XOR_SHIFT_1024_S     	false       	0,0,0,0,0     	0,0,0,0,0         
XOR_SHIFT_1024_S     	true        	0,0,0,0,0     	0,0,1,0,0         
TWO_CMRES            	false       	2,2,2,2,2     	4,3,3,5,4         
TWO_CMRES            	true        	7,5,5,7,6     	4,3,4,4,4         
MT_64                	false       	0,0,0,0,0     	2,3,2,2,2         
MT_64                	true        	0,0,0,0,0     	2,2,2,2,2         
MWC_256              	false       	0,0,0,0,0     	0,0,0,2,0         
MWC_256              	true        	0,0,0,0,0     	1,0,0,0,0         
KISS                 	false       	0,0,0,0,0     	0,0,0,0,0         
KISS                 	true        	0,0,0,0,0     	0,0,1,0,1         
XOR_SHIFT_1024_S_PHI 	false       	0,0,0,0,0     	0,0,0,0,0         
XOR_SHIFT_1024_S_PHI 	true        	0,0,0,0,0     	0,0,2,0,0         
XO_RO_SHI_RO_64_S    	false       	0,0,0,0,0     	1,1,2,1,3         
XO_RO_SHI_RO_64_S    	true        	0,0,0,0,0     	2,2,2,2,2         
XO_RO_SHI_RO_64_SS   	false       	0,0,0,0,0     	0,0,0,0,0         
XO_RO_SHI_RO_64_SS   	true        	0,0,0,0,0     	1,0,0,0,0         
XO_SHI_RO_128_PLUS   	false       	0,0,0,0,0     	1,2,2,1,1         
XO_SHI_RO_128_PLUS   	true        	0,0,0,0,0     	2,2,2,2,2         
XO_SHI_RO_128_SS     	false       	0,0,0,0,0     	0,1,0,0,0         
XO_SHI_RO_128_SS     	true        	0,0,0,0,0     	0,0,0,0,0         
XO_RO_SHI_RO_128_PLUS	false       	0,0,0,0,0     	0,1,0,0,0         
XO_RO_SHI_RO_128_PLUS	true        	0,0,0,0,0     	2,1,1,1,2         
XO_RO_SHI_RO_128_SS  	false       	0,0,0,0,0     	1,0,1,0,0         
XO_RO_SHI_RO_128_SS  	true        	0,0,0,0,0     	0,0,2,0,0         
XO_SHI_RO_256_PLUS   	false       	0,0,0,0,0     	0,0,0,0,0         
XO_SHI_RO_256_PLUS   	true        	0,0,0,0,0     	0,0,0,0,0         
XO_SHI_RO_256_SS     	false       	0,0,0,0,0     	0,1,0,2,1         
XO_SHI_RO_256_SS     	true        	0,0,0,0,0     	0,1,1,1,2         
XO_SHI_RO_512_PLUS   	false       	0,0,0,0,0     	0,0,0,2,2         
XO_SHI_RO_512_PLUS   	true        	0,0,0,0,0     	1,0,0,0,1         
XO_SHI_RO_512_SS     	false       	0,0,0,0,0     	0,1,0,1,0         
XO_SHI_RO_512_SS     	true        	0,0,0,0,0     	0,1,1,0,0 

So, bit-reversed, the JDK is terrible at Dieharder. It actually improves for BigCrush, from
terrible to less terrible. TWO_CMRES is a bit worse at Dieharder when bit-reversed but no
different at BigCrush (it was already systematically failing 3 tests).
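
For reference, bit reversal of a 32-bit output can be done in Java with `Integer.reverse`,
which is distinct from the byte reversal (`Integer.reverseBytes`) discussed earlier. A minimal
sketch, with a hypothetical wrapper around any int source; it is not the test application's
implementation:

```java
import java.util.function.IntSupplier;

// Hypothetical illustration; not the stress test application's code.
final class BitReversedSketch {

    /** Wraps an int source so that every output has its 32 bits in reverse order. */
    static IntSupplier bitReversed(IntSupplier source) {
        return () -> Integer.reverse(source.getAsInt());
    }

    public static void main(String[] args) {
        IntSupplier source = () -> 0x00000001;
        // Bit reversal moves the lowest bit to the highest position.
        System.out.println(Integer.toHexString(bitReversed(source).getAsInt()));
        // Byte reversal only swaps whole bytes, so the result differs.
        System.out.println(Integer.toHexString(Integer.reverseBytes(source.getAsInt())));
    }
}
```

With such a wrapper the low bits of each output land in the positions that the test suite is
reported to weight most.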

All the other generators have similar results when bit-reversed. So adding the bit-reversed
results to the user guide does not appear worthwhile. I will archive these and they can be
added later if required, for example to show a good generator against a bad one. This will
only be relevant if the library adds reference implementations of bad generators. Currently
only the JDK is a bad generator.

Next:

I have added a ‘results’ command to the stress test application that can generate these
results tables. It requires some header information not found in the old results files, so
it only works with the new results. It can generate the APT table directly for the user guide.
This will be useful for updating the results when more generators are added.

The new results are named using the test suite (dh_ or tu_), optionally the bit-reversed flag
(r_), the enum ordinal and the trial run:

dh_1_1 = Dieharder for JDK trial 1
tu_1_1 = BigCrush for JDK trial 1
dh_r_2_3 = Dieharder bit reversed for WELL_512_A trial 3
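
The naming scheme can be sketched as below. The class and method names are hypothetical, and
the generator number is taken as a plain 1-based int because the examples use 1 for JDK and 2
for WELL_512_A; how the application derives it from the enum is not shown here.

```java
// Hypothetical illustration of the naming scheme; not the stress test application's code.
final class ResultNameSketch {

    /**
     * Builds a result name from the suite prefix ("dh_" or "tu_"), the optional
     * bit-reversed flag, the generator number and the trial number.
     */
    static String resultName(String suitePrefix, boolean bitReversed,
                             int generatorNumber, int trial) {
        StringBuilder sb = new StringBuilder(suitePrefix);
        if (bitReversed) {
            sb.append("r_");
        }
        return sb.append(generatorNumber).append('_').append(trial).toString();
    }

    public static void main(String[] args) {
        System.out.println(resultName("dh_", false, 1, 1));
        System.out.println(resultName("tu_", false, 1, 1));
        System.out.println(resultName("dh_", true, 2, 3));
    }
}
```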

I propose to:

- Delete all the old results and add these new ones using a new directory structure. All results
can reside in a single directory.
- Ignore for now the bit-reversed results.
- Delete the old stress test code. The new code supersedes all functionality of the old version.
- Commit the new ‘results’ command when I have confirmed the APT table is correctly generated.

Questions:

1. Do we stick to using 3 trials or update to 5 (because I have the results)?
2. Do we remove the diehard_sums test result?

I would recommend removing diehard_sums. It pollutes the results for most generators with
a spurious fail that should be ignored.
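
Removing the test from the reported counts amounts to filtering it out before counting. A
minimal sketch, with hypothetical names and made-up input; the ‘results’ command may do this
differently:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the stress test application's code.
final class IgnoreTestSketch {

    /** Counts the failed tests in one trial, skipping any test named in the ignored set. */
    static long countFailures(List<String> failedTests, Set<String> ignored) {
        return failedTests.stream()
            .filter(name -> !ignored.contains(name))
            .count();
    }

    public static void main(String[] args) {
        // Made-up failure list for a single trial; NOT an actual result.
        List<String> failed = Arrays.asList("diehard_sums", "diehard_dna");
        Set<String> ignored = Collections.singleton("diehard_sums");
        System.out.println(countFailures(failed, ignored));
    }
}
```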


