Hi.
Le ven. 10 mai 2019 à 15:53, Alex Herbert <alex.d.herbert@gmail.com> a écrit :
>
>
> On 10/05/2019 14:27, Gilles Sadowski wrote:
> > Hi Alex.
> >
> > Le ven. 10 mai 2019 à 13:57, Alex Herbert <alex.d.herbert@gmail.com> a écrit
:
> >> Can I get a review of the PR for RNG101 please.
> > Thanks for this work!
> >
> > I didn't go into the details; however, I see many fields and methods like
> > table1 ... table5
> > fillTable1 ... fillTable5
> > getTable1 ... getTable5
> > Wouldn't it be possible to use a 2D table:
> > table[5][];
> > so that e.g. only one "fillTable(int tableIndex, /* other args */)" method
> > is necessary (where "tableIndex" runs from 0 to 4)?
>
> Yes. The design is based around using 5 tables as per the example code.
>
> The sample() method knows which table it needs so it can directly jump
> to the table in question. I'd have to look at the difference in speed
> when using a 2D table as you are adding another array access but
> reducing the number of possible method calls (although you still need a
> method call). Maybe this will be optimised out by the JVM.
>
> If the speed is not a factor then I'll rewrite it. Otherwise it's
> probably better done for speed as this is the entire point of the
> sampler given it disregards any probability under 2^31 (i.e. it's not a
> perfectly fair sampler).
>
> Note that 5 tables are needed for 5 hex digits (base 2^6). The paper
> states using 3 tables of base 2^10 then you get a speed increase
> (roughly 1.16x) at the cost of storage (roughly 9x). Changing to 2
> tables of base 2^15 does not make it much faster again.
>
> I'll have a rethink to see if I can make the design work for different
> base sizes.
That could be an extension made easier with the 2D table, but
I quite agree that given the relatively minor speed improvement
to be expected, it is not the main reason; the rationale was just to
make the code a more compact and a little easier to grasp (IMHO).
Gilles
>
> >
> > The diff for "DiscreteSamplersList.java" refers to
> > MarsagliaTsangWangBinomialSampler
> > but
> > MarsagliaTsangWangSmallMeanPoissonSampler
> > seems to be missing.
>
> Oops, I missed adding that back. I built the PR from code where I was
> testing lots of implementations.
>
> I've just added it back and it is still passing locally. Travis should
> see that too as I pushed the change to the PR.
>
> >
> > Regards,
> > Gilles
> >
> >> This is a new sampler based on the source code from the paper:
> >>
> >> George Marsaglia, Wai Wan Tsang, Jingbo Wang (2004)
> >> Fast Generation of Discrete Random Variables.
> >> Journal of Statistical Software. Vol. 11, Issue. 3, pp. 111.
> >>
> >> https://www.jstatsoft.org/article/view/v011i03
> >>
> >> The code has no explicit licence.
> >>
> >> The paper states:
> >>
> >> "We have provided C versions of the two methods described here, for
> >> inclusion in the “Browse
> >> files”section of the journal. ... You may then want to examine the
> >> components of the two files, for illumination
> >> or for extracting portions that might be usefully applied to your
> >> discrete distributions."
> >>
> >> So I assuming that it can be incorporated with little modification.
> >>
> >> The Java implementation has been rewritten to allow the storage to be
> >> optimised for the required size. The generation of the tables has been
> >> adapted appropriately and checks have been added on the input parameters
> >> to ensure the sampler does not generate exceptions once constructed (I
> >> found out the hard way that the original code was not entirely correct).
> >>
> >> Thanks.
> >>
> >> Alex

To unsubscribe, email: devunsubscribe@commons.apache.org
For additional commands, email: devhelp@commons.apache.org
