commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex D Herbert (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RNG-50) PoissonSampler single use speed improvements
Date Tue, 31 Jul 2018 17:11:00 GMT

    [ https://issues.apache.org/jira/browse/RNG-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564006#comment-16564006
] 

Alex D Herbert commented on RNG-50:
-----------------------------------

I've built a simple test using JHM based on {{org.apache.commons.rng.sampling.distribution.PoissonSamplersPerformance}}
in the commons-rng-examples-jmh project.

I note that in that class the {{Sources}} nested class returns the {{UniformRandomProvider}}
without resetting the state. For consistency I changed it to:
{code:java}
/**
 * The benchmark state (retrieve the various "RandomSource"s).
 */
@State(Scope.Benchmark)
public static class Sources {
    /**
     * RNG providers.
     */
    @Param({
            "WELL_19937_C",
            "WELL_44497_B",
            "SPLIT_MIX_64"
            // and the rest ...
        })
    private String randomSourceName;

    /** RNG. */
    private RestorableUniformRandomProvider generator;

    /**
     * The state of the generator at the start of the test (for reproducible results).
     */
    private RandomProviderState state;

    /**
     * @return the RNG.
     */
    public UniformRandomProvider getGenerator() {
        generator.restoreState(state);
        return generator;
    }

    /** Instantiates generator. */
    @Setup
    public void setup() {
        final RandomSource randomSource = RandomSource.valueOf(randomSourceName);
        generator = RandomSource.create(randomSource);
        state = generator.saveState();
    }
}
{code}
I've done a simple run with a small mean and large mean Poisson sampler and one that appropriately
wraps them. I'll repeat this later with more loops and different RNGs but here are the initial
results.
||Benchmark||Mode||Threads||Samples||Score||Score Error (99.9%)||Unit||Param: mean||Param:
randomSourceName||
|LargeMeanPoissonSampler|avgt|1|5|2009.632668|263.007827|us/op|40.3|SPLIT_MIX_64|
|LargeMeanPoissonSampler|avgt|1|5|1976.939758|40.980963|us/op|57.9|SPLIT_MIX_64|
|WrappedPoissonSampler|avgt|1|5|1932.500790|13.686808|us/op|40.3|SPLIT_MIX_64|
|WrappedPoissonSampler|avgt|1|5|1983.504209|21.960997|us/op|57.9|SPLIT_MIX_64|
|PoissonSampler|avgt|1|5|4691.596605|87.865287|us/op|40.3|SPLIT_MIX_64|
|PoissonSampler|avgt|1|5|4742.322853|14.968683|us/op|57.9|SPLIT_MIX_64|
|WrappedPoissonSampler|avgt|1|5|295.546146|1.960616|us/op|5.3|SPLIT_MIX_64|
|WrappedPoissonSampler|avgt|1|5|363.830951|1.737449|us/op|8.5|SPLIT_MIX_64|
|WrappedPoissonSampler|avgt|1|5|972.160790|9.791746|us/op|35.7|SPLIT_MIX_64|
|PoissonSampler|avgt|1|5|685.555186|1.072801|us/op|5.3|SPLIT_MIX_64|
|PoissonSampler|avgt|1|5|753.809844|1.787944|us/op|8.5|SPLIT_MIX_64|
|PoissonSampler|avgt|1|5|1342.103690|8.041261|us/op|35.7|SPLIT_MIX_64|
|SmallMeanPoissonSampler|avgt|1|5|279.659204|1.052956|us/op|5.3|SPLIT_MIX_64|
|SmallMeanPoissonSampler|avgt|1|5|352.463058|1.448008|us/op|8.5|SPLIT_MIX_64|
|SmallMeanPoissonSampler|avgt|1|5|965.718523|4.353908|us/op|35.7|SPLIT_MIX_64|

Rearranged:
||Source||Mean||Name||Relative Score||
|SPLIT_MIX_64|5.3|PoissonSampler|1|
|SPLIT_MIX_64|5.3|WrappedPoissonSampler|0.431104821370281|
|SPLIT_MIX_64|5.3|SmallMeanPoissonSampler|0.407930987484354|
|SPLIT_MIX_64|8.5|PoissonSampler|1|
|SPLIT_MIX_64|8.5|WrappedPoissonSampler|0.482656141858503|
|SPLIT_MIX_64|8.5|SmallMeanPoissonSampler|0.467575557424002|
|SPLIT_MIX_64|35.7|PoissonSampler|1|
|SPLIT_MIX_64|35.7|WrappedPoissonSampler|0.724355947490168|
|SPLIT_MIX_64|35.7|SmallMeanPoissonSampler|0.719555821353863|
|SPLIT_MIX_64|40.3|PoissonSampler|1|
|SPLIT_MIX_64|40.3|LargeMeanPoissonSampler|0.428347284985726|
|SPLIT_MIX_64|40.3|WrappedPoissonSampler|0.411906852336892|
|SPLIT_MIX_64|57.9|PoissonSampler|1|
|SPLIT_MIX_64|57.9|WrappedPoissonSampler|0.41825583590228|
|SPLIT_MIX_64|57.9|LargeMeanPoissonSampler|0.416871608973097|

> PoissonSampler single use speed improvements
> --------------------------------------------
>
>                 Key: RNG-50
>                 URL: https://issues.apache.org/jira/browse/RNG-50
>             Project: Commons RNG
>          Issue Type: Improvement
>    Affects Versions: 1.0
>            Reporter: Alex D Herbert
>            Priority: Minor
>         Attachments: PoissonSamplerTest.java
>
>
> The Sampler architecture of {{org.apache.commons.rng.sampling.distribution}} is nicely
written for fast sampling of small dataset sizes. The constructors for the samplers do not
check the input parameters are valid for the respective distributions (in contrast to the
old {{org.apache.commons.math3.random.distribution}} classes). I assume this is a design choice
for speed. Thus most of the samplers can be used within a loop to sample just one value with
very little overhead.
> The {{PoissonSampler}} precomputes log factorial numbers upon construction if the mean
is above 40. This is done using the {{InternalUtils.FactorialLog}} class. As of version 1.0
this internal class is currently only used in the {{PoissonSampler}}.
> The cache size is limited to 2*PIVOT (where PIVOT=40). But it creates and precomputes
the cache every time a PoissonSampler is constructed if the mean is above the PIVOT value.
> Why not create this once in a static block for the PoissonSampler?
> {code:java}
> /** {@code log(n!)}. */
> private static final FactorialLog factorialLog;
>      
> static 
> {
>     factorialLog = FactorialLog.create().withCache((int) (2 * PoissonSampler.PIVOT));
> }
> {code}
> This will make the construction cost of a new {{PoissonSampler}} negligible. If the table
is computed dynamically as a static construction method then the overhead will be in the first
use. Thus the following call will be much faster:
> {code:java}
> UniformRandomProvider rng = ...;
> int value = new PoissonSampler(rng, 50).sample();
> {code}
> I have tested this modification (see attached file) and the results are:
> {noformat}
> Mean 40  Single construction ( 7330792) vs Loop construction                        
 (24334724)   (3.319522.2x faster)
> Mean 40  Single construction ( 7330792) vs Loop construction with static FactorialLog
( 7990656)   (1.090013.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction                        
 (19389026)   (3.034132.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction with static FactorialLog
( 6146556)   (0.961857.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction                        
 (21337678)   (3.532047.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction with static FactorialLog
( 5329129)   (0.882136.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction                        
 (23963516)   (3.951765.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction with static FactorialLog
( 5306081)   (0.875013.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction                        
 (26381365)   (4.349935.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction with static FactorialLog
( 6341274)   (1.045591.2x faster)
> {noformat}
> Thus the speed improvements would be approximately 3-4 fold for single use Poisson sampling.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message