commons-dev mailing list archives

From Mikkel Meyer Andersen
Subject Re: [math] Generate random data using the Inverse CDF Method?
Date Mon, 02 Nov 2009 21:06:53 GMT
I agree, Ted. Those seem like reasonable arguments.

Another way of implementing the functionality is to have the
Distribution classes call some
nextSample(Abstract{Continuous, Discrete}Distribution) in the Random
package (with nextSample(ExponentialDistribution) delegating to
nextExponential as an optimised implementation). Would that be a
better way of doing it? I think this is conceptually wrong, but I'm
ready to do it if it means it goes through.
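
A rough sketch of what that delegation could look like, only to make the
idea concrete. Every name here (Sampler, AbstractContinuousDist,
ExponentialDist, UniformDist, sample) is hypothetical and is not the
existing commons-math API. The generic nextSample falls back to the
inverse CDF method, and the exponential overload stands in for an
optimised special case.

```java
import java.util.Random;

/** Hypothetical base class; NOT the commons-math distribution class. */
abstract class AbstractContinuousDist {
    /** Returns x such that P(X <= x) = p, for p in (0, 1). */
    abstract double inverseCumulativeProbability(double p);

    /** Sampling is reachable from the distribution, but implemented elsewhere. */
    double sample(Sampler s) {
        // Static type of 'this' is the base class, so the generic overload is used.
        return s.nextSample(this);
    }
}

/** Hypothetical exponential distribution parameterised by its mean. */
class ExponentialDist extends AbstractContinuousDist {
    final double mean;

    ExponentialDist(double mean) { this.mean = mean; }

    @Override
    double inverseCumulativeProbability(double p) {
        return -mean * Math.log(1.0 - p);
    }

    @Override
    double sample(Sampler s) {
        // Java picks overloads at compile time, so this override is needed
        // to reach the ExponentialDist-specific nextSample.
        return s.nextSample(this);
    }
}

/** Hypothetical uniform distribution on [lo, hi]; it has no special-purpose sampler. */
class UniformDist extends AbstractContinuousDist {
    final double lo, hi;

    UniformDist(double lo, double hi) { this.lo = lo; this.hi = hi; }

    @Override
    double inverseCumulativeProbability(double p) {
        return lo + p * (hi - lo);
    }
}

/** Hypothetical sampler that would live in the random package. */
class Sampler {
    private final Random rng;

    Sampler(Random rng) { this.rng = rng; }

    /** Generic fallback: inverse CDF method applied to a uniform draw in (0, 1). */
    double nextSample(AbstractContinuousDist d) {
        double u;
        do { u = rng.nextDouble(); } while (u == 0.0); // exclude the endpoint 0
        return d.inverseCumulativeProbability(u);
    }

    /** Special-purpose path for the exponential distribution. */
    double nextSample(ExponentialDist d) {
        return nextExponential(d.mean);
    }

    /** Stand-in for an optimised generator; here simply -mean * log(1 - u). */
    double nextExponential(double mean) {
        double u = rng.nextDouble(); // in [0, 1), so 1 - u is in (0, 1]
        return -mean * Math.log(1.0 - u);
    }
}
```

With this shape the user only ever writes something like
new UniformDist(1.0, 3.0).sample(sampler), and any distribution without
a dedicated generator still gets *some* sampling method through the
inverse CDF fallback.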

Cheers, Mikkel.

2009/11/3 Ted Dunning:
> We should probably say which parts of the problem are important to us.  It
> begins to sound like we each care about slightly different aspects of the
> problem.
> The only points that I really care about are:
> - the user should have available some obvious way to sample from a
> distribution as a method on the distribution itself.  This need is not met
> by having a completely separate class in a different package that the user
> must somehow intuit the existence of.
> - the user should have the widest possible number of distributions that have
> *some* kind of sampling procedure that produces accurate samples.  Moreover,
> this wide availability should happen very soon.
> Note that neither of these points really implies much about implementation
> other than where the user of commons-math can find access to
> implementations and that we implement something across many distributions
> very soon.
> These are points that I explicitly don't care about:
> - should the implementation be based on inverse cumulative distributions if
> available?  If there is another way to get lots of sampling algorithms
> implemented, I am all for it.  Marsaglia's table method for discrete
> distributions is an interesting option for some cases.  There may be other
> algorithms that could have wide applicability.  Multiple approaches might be
> a good idea: special purpose samplers for some cases (like normal or
> exponential distributions), and fairly general methods like Marsaglia's method
> where they can be used.  If all of the common cases have special purpose, high
> quality generators, I don't see a problem with letting all of the other
> distributions that we haven't considered yet fall back to inverse cumulative
> methods.  But all of these considerations are not what I really care about.
> I only care about very wide availability of *some* sampling method.
> - should there be random number generators that provide more
> generality/flexibility/alternative implementations for sampling for various
> distributions?  This is an implementation question that can be answered many
> ways.  I think that lots of alternatives are good.  I even think that having
> pure implementations of one method or another might be an excellent way to
> allow us to stitch together the sampling available by default from the
> distribution.  All of these considerations, however, are not what I really
> care about.  What I care about is that all of these implementations should
> be ignorable by a less than devoted user of commons math.
> Now, it seems to me that the points that Phil cares most about fall mostly
> into the set of things that I care less about.  Moreover, some of the
> opinions that Phil has expressed have been stated in ways that I may have
> misinterpreted.  For instance, it sounded to me like Phil was saying that we
> shouldn't even implement the inverse cumulative sampler.  On reflection, I
> think that his real point is that we should not use the inverse cumulative
> method where there are better methods, especially if we already have
> implementations of the better methods.
> Likewise, it sounded to me like Phil was saying that we absolutely shouldn't
> allow easy access to a community consensus sampling algorithm from the
> distribution.  On further reflection, I think that his real point is that we
> simply should not be doing most implementation in the distribution function
> class, but should have a separate package to separate all that work away
> from the view of the users.  That sounds like a really good idea, if only to
> decrease the noise for the casual user of the distribution classes.
> This sounds like the germ of compromise.
> On Mon, Nov 2, 2009 at 3:03 AM, Phil Steitz wrote:
>>  I just don't like your suggested implementation and package
>> placement.  I proposed an alternative (a generic method added
>> somewhere in the random package), which you did not like. There are
>> no doubt other better ways to do this.  Perhaps others have ideas?
> --
> Ted Dunning, CTO
> DeepDyve
