On 10/05/2019 15:07, Gilles Sadowski wrote:
> Hi.
>
> Le ven. 10 mai 2019 à 15:53, Alex Herbert a écrit :
>>
>> On 10/05/2019 14:27, Gilles Sadowski wrote:
>>> Hi Alex.
>>>
>>> Le ven. 10 mai 2019 à 13:57, Alex Herbert a écrit :
>>>> Can I get a review of the PR for RNG-101 please.
>>> Thanks for this work!
>>>
>>> I didn't go into the details; however, I see many fields and methods like
>>> table1 ... table5
>>> fillTable1 ... fillTable5
>>> getTable1 ... getTable5
>>> Wouldn't it be possible to use a 2D table:
>>> table[5][];
>>> so that e.g. only one "fillTable(int tableIndex, /* other args */)" method
>>> is necessary (where "tableIndex" runs from 0 to 4)?
>> Yes. The design is based around using 5 tables as per the example code.
>>
>> The sample() method knows which table it needs so it can directly jump
>> to the table in question. I'd have to look at the difference in speed
>> when using a 2D table as you are adding another array access but
>> reducing the number of possible method calls (although you still need a
>> method call). Maybe this will be optimised out by the JVM.
>>
>> If the speed is not a factor then I'll rewrite it. Otherwise it's
>> probably better done for speed as this is the entire point of the
>> sampler given it disregards any probability under 2^-31 (i.e. it's not a
>> perfectly fair sampler).
>>
>> Note that 5 tables are needed for 5 hex digits (base 2^6). The paper
>> states using 3 tables of base 2^10 then you get a speed increase
>> (roughly 1.16x) at the cost of storage (roughly 9x). Changing to 2
>> tables of base 2^15 does not make it much faster again.
>>
>> I'll have a rethink to see if I can make the design work for different
>> base sizes.
> That could be an extension made easier with the 2D table, but
> I quite agree that given the relatively minor speed improvement
> to be expected, it is not the main reason; the rationale was just to
> make the code a more compact and a little easier to grasp (IMHO).
>
> Gilles
Benchmark (randomSourceName) (samplerType) Mode Cnt Score Error Units
DiscreteSamplersPerformance.baseline N/A N/A avgt 5 2.043 ± 0.015 ns/op
DiscreteSamplersPerformance.nextInt SPLIT_MIX_64 N/A avgt 5 3.577 ± 0.028 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler avgt 5 5.550 ± 0.019 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler2 avgt 5 5.974 ± 0.073 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangSmallMeanPoissonSampler avgt 5 8.104 ± 0.048 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangSmallMeanPoissonSampler2 avgt 5 8.217 ± 0.015 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangBinomialSampler avgt 5 8.321 ± 0.028 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangBinomialSampler2 avgt 5 9.277 ± 0.167 ns/op
The Poisson(mean = 22.9) sampler has a small difference:
8.104 - 3.577 = 4.527
8.217 - 3.577 = 4.64
About 2.4% slower.
But the Binomial (n=67,p=0.33) sampler is a fair bit slower:
8.321 - 3.577 = 4.744
9.277 - 3.577 = 5.7
About 20% slower.
So it seems that the JVM cannot optimise the 2D table look-up. It may
well be due to the use of an interface to support different table
storage types:
table.get(0, n)
If working direct with the array then:
array[0][n]
may be optimised away to just the second index access.
As it is the following:
table.get0(n)
table.get1(n)
...
is faster than
table.get(0, n)
table.get(1, n)
...
I have to admit that the code with the 2D table is nice and compact. But
thinking about the implementation it can still support 2, 3, or 5 tables
with the current approach. The later tables would just be empty.
Here are some results for a quick hack of a 3 table version (the suffix
10 is for base 10):
Benchmark (randomSourceName) (samplerType) Mode Cnt Score Error Units
DiscreteSamplersPerformance.baseline N/A N/A avgt 5 2.042 ± 0.008 ns/op
DiscreteSamplersPerformance.nextInt SPLIT_MIX_64 N/A avgt 5 3.577 ± 0.025 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler avgt 5 6.087 ± 2.804 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler2 avgt 5 6.002 ± 0.141 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler10 avgt 5 5.301 ± 0.015 ns/op
DiscreteSamplersPerformance.sample SPLIT_MIX_64 MarsagliaTsangWangDiscreteSampler102 avgt 5 5.715 ± 0.007 ns/op
There's a timing anomaly for the original
MarsagliaTsangWangDiscreteSampler. The median is 5.76. Subtracting the
baseline and computing the relative performance makes the original
version 26% slower when using 5 tables rather than 3, the 2D version is
13% slower as the cost of the 2D look-up is present in both and the
reduced number of tables has less of an effect.
I will create a version that can use either 3 or 5 tables and avoid the
2D table. Then investigate the actual table storage requirements. It may
be better to use 5 tables by default and allow an override in the
constructor for 3 tables.
Alex
>
>>> The diff for "DiscreteSamplersList.java" refers to
>>> MarsagliaTsangWangBinomialSampler
>>> but
>>> MarsagliaTsangWangSmallMeanPoissonSampler
>>> seems to be missing.
>> Oops, I missed adding that back. I built the PR from code where I was
>> testing lots of implementations.
>>
>> I've just added it back and it is still passing locally. Travis should
>> see that too as I pushed the change to the PR.
>>
>>> Regards,
>>> Gilles
>>>
>>>> This is a new sampler based on the source code from the paper:
>>>>
>>>> George Marsaglia, Wai Wan Tsang, Jingbo Wang (2004)
>>>> Fast Generation of Discrete Random Variables.
>>>> Journal of Statistical Software. Vol. 11, Issue. 3, pp. 1-11.
>>>>
>>>> https://www.jstatsoft.org/article/view/v011i03
>>>>
>>>> The code has no explicit licence.
>>>>
>>>> The paper states:
>>>>
>>>> "We have provided C versions of the two methods described here, for
>>>> inclusion in the “Browse
>>>> files”section of the journal. ... You may then want to examine the
>>>> components of the two files, for illumination
>>>> or for extracting portions that might be usefully applied to your
>>>> discrete distributions."
>>>>
>>>> So I assuming that it can be incorporated with little modification.
>>>>
>>>> The Java implementation has been rewritten to allow the storage to be
>>>> optimised for the required size. The generation of the tables has been
>>>> adapted appropriately and checks have been added on the input parameters
>>>> to ensure the sampler does not generate exceptions once constructed (I
>>>> found out the hard way that the original code was not entirely correct).
>>>>
>>>> Thanks.
>>>>
>>>> Alex
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>