[ https://issues.apache.org/jira/browse/RNG-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819878#comment-16819878 ]
Alex D Herbert commented on RNG-91:
-----------------------------------
Original timings above were using the Kemp method that effectively computes the cumulative probability of the Poisson distribution until the cumulative probability is above the value of the random deviate.
The cumulative probability is computed *for each sample*. This is a lot of repeat computation.
However it should be noted that the value of the random deviate will be above 0.5 half of the time. Therefore for repeat sampling it is worth computing and storing the cumulative probability and the sample value that corresponds to approximately 0.5 in the cumulative probability function. This will be known as the 50% hedge value.
This will *save half of the computation* for *half of the samples*. This is expected to produce a speed increase of 50% and so *reduce runtime to 2/3* of that listed above.
I have tested an implementation that uses this 50% hedge.
Here is an output of the raw data for various means using the original Kemp algorithm and the modified Kemp algorithm. The algorithm is compared to the current Small and Large mean Poisson sampler.
Note that the XO_RO_SHI_RO_128_PLUS generator is the fastest currently in the library for double floating point generation. The Small mean sampler is very dependent on the speed of the generator so a slow algorithm is once again included to demonstrate this (WELL_44497B).
| |Small| | |OriginalKemp| | |Kemp| | |Large| | |
|mean|Xo|Split|Well|Xo|Split|Well|Xo|Split|Well|Xo|Split|Well|
|0.25|8.63|10.03|30.14|7.87|9.07|22.16|8.13|9.04|22.04| | | |
|0.5|12.31|14.60|37.72|11.34|13.07|25.42|11.06|12.39|25.09| | | |
|1|16.71|19.92|49.53|15.20|17.25|29.59|13.58|15.41|28.04|91.78|100.64|203.57|
|2|21.37|23.14|67.82|20.64|21.00|36.40|16.73|18.70|31.38|88.51|97.72|193.22|
|4|25.94|28.17|101.93|26.91|27.22|43.46|21.52|23.51|39.24|81.67|88.89|172.74|
|8|35.49|37.44|167.94|44.26|44.58|60.72|33.04|36.91|49.21|72.96|79.27|145.84|
|16|52.61|55.00|300.23|78.55|79.28|94.81|53.58|55.03|68.86|64.89|69.99|134.86|
|32|87.17|90.55|562.73|147.19|147.15|162.92|90.86|92.52|107.10|59.94|62.67|146.87|
Sorry for the large table width but the raw data allows a lot of analysis to be done. Algorithms are:
* Split = SPLIT_MIX_64
* Xo = XO_RO_SHI_RO_128_PLUS
* Well = WELL_44497_B
The LargeMean sampler is invalid at a mean below 1.
*Analysis*
The new hedge computation is very similar to the original Kemp sampler at very low mean. The added complexity of the algorithm cancels the speed increase of the 50% hedge. It should also be noted that at a mean of 0.25 the cumulative probability of a Poisson distribution is 0.779 for n=0. This high skew in the distribution effectively means the hedge is rarely used and the complexity of the algorithm is a waste.
When the mean is large the new 50% hedge algorithm does show the approximate run-time at 2/3 of the original Kemp algorithm. So the hedge algorithm is working as expected,
When the mean is in the range 0.5 to 8 the new hedged Kemp sampler is the fastest.
Only at mean 16 and 32 is the current Small mean sampler faster and this is by less then 5% and requires a very fast generator. When the generator is slow the new Kemp sampler is faster by 5-fold due to the small sampler requiring multiple double values from the RNG per poisson sample.
The Large mean sampler is currently used in the library at mean 40. This table shows that with a fast RNG the sampler starts outperforming the small samplers around a mean of 32. When the sampler is slow it outperforms the small mean sampler at mean 8 but is still slower than the new Kemp sampler when the mean is 32.
Note: The bound of 40 on the large mean poisson sampler may not be based on performance and instead on the validity of the algorithm. It is based around approximating the poisson distribution using a normal distribution with handling of the skew when the mean is low. This sampler may not even be valid for very low means and the current units tests use a mean of 67.89 to test the distribution.
This is perhaps best illustrated with a chart (this omits the original Kemp sampler and only includes the new Kemp sampler with the 50% hedge):
!poisson-samplers.jpg!
The new code for the Kemp sampler and the performance test is submitted in a PR.
A recommendation would be to:
* Switch the code to use the Kemp sampler instead of the Small mean poisson sampler. Speed is approximately equivalent with the fastest generators in the library and is much better when the generator is slow as only 1 double from the RNG is used per sample.
* Maintain the switch point of 40 for the change to the large mean poisson sampler. This is a valid point on the basis of performance and may be imposed anyway by the details of the algorithm
> Kemp small mean poisson sampler
> -------------------------------
>
> Key: RNG-91
> URL: https://issues.apache.org/jira/browse/RNG-91
> Project: Commons RNG
> Issue Type: New Feature
> Components: sampling
> Affects Versions: 1.3
> Reporter: Alex D Herbert
> Assignee: Alex D Herbert
> Priority: Minor
> Fix For: 1.3
>
> Attachments: kemp.jpg, poisson-samplers.jpg
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The current algorithm for the {{SmallMeanPoissonSampler}} is used to generate Poisson samples for any mean up to 40. The sampler requires approximately n samples from a RNG to generate 1 Poisson deviate for a mean of n.
> The [Kemp (1981)|https://www.jstor.org/stable/2346348] algorithm requires 1 sample from the RNG and then accrues a cumulative probability using a recurrence relation to compute each successive Poisson probability:
> {noformat}
> p(n+1) = p(n) * mean / (n+1)
> {noformat}
> The full algorithm is here:
> {code:java}
> mean = ...;
> final double p0 = Math.exp(-mean);
> @Override
> public int sample() {
> double u = rng.nextDouble();
> int x = 0;
> double p = p0;
> // The algorithm listed in Kemp (1981) does not check that the rolling probability
> // is positive. This check is added to ensure no errors when the limit of the summation
> // 1 - sum(p(x)) is above 0 due to cumulative error in floating point arithmetic.
> while (u > p && p != 0) {
> u -= p;
> x++;
> p = p * mean / x;
> }
> return x;
> }
> {code}
> The limit for the sampler is set by the ability to compute p0. This is approximately 744.440 when Math.exp(-mean) returns 0.
> A conservative limit of 700 sets an initial probability p0 of 9.85967654375977E-305. When run through the summation series for the limit (u initialised to 1) the result when the summation ends (p is zero) leaves u = 3.335439283623915E-15. This is close to the expected tolerance for floating point error (Note: 1 - Math.nextDown(1) = 1.1102230246251565E-16).
> Using a mean of 10 leaves u = 4.988586742717954E-17. So smaller means have a similar error. The error in the cumulative sum is expected to result in truncation of the long tail of the Poisson distribution (which should be bounded at infinity).
> This sampler should outperform the current {{SmallMeanPoissonSampler}} as it requires 1 uniform deviate per sample.
> Note that the \{[SmallMeanPoissonSampler}} uses a limit for the mean of Integer.MAX_VALUE / 2. This should be updated since it also relies on p0 being above zero.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)