Frank,
After reading carefully again and thinking about some practical examples, I agree that the
current framework has a fundamental and unnecessary limitation. The "point mass at 0, continuous
beyond 0" example below does occur in practical applications (e.g. component lifetimes, 0
= defective). As I said in a previous post, the distributions package was designed to house
commonly used "parametric" distributions like the ones that are implemented now; but there
is no reason that the framework could not be used to support any kind of distribution. Therefore,
since the change to add a base interface is small and does not really complicate the structure
or client code, I am +0 for adding it. Any other opinions on this?
More specific comments below.
>> > Well, the problem is this: If I need to create some custom discrete
>> > distribution that doesn't take on integer values, what interface should I
>> > implement? With your model I have no choice but use the
>> > ContinuousDistribution interface even though the distribution *isn't*
>> > continuous. Does that make sense?
>>
>> Can you provide a practical example of this? IIUC, what you are really
>> arguing for is changing the int's in the DiscreteDistribution interface to
>> doubles. This has the advantage of greater generality but makes it
>> slightly less convenient for implementors of the most common discrete
>> distributions, where the values are integers.
>Well, changing the int's in the DiscreteDistribution interface to doubles is
>kind of a workaround, but I don't think it will settle the issue for good,
>see below.
Agreed.
>As for examples, you can take *any* mixed distribution as an example of what I
>mean. Consider a random variable X with domain D that can be partitioned
>into subsets A and B such that
>1. A is a countable set and 0 < P(X is in A) < 1
>2. P(X = x) = 0 for all x in B
>
> How would the distribution for such a random variable be represented in
>your framework?
Not possible.
>As a simple example of this, consider a random variable with the density
>f(x) = 0.5 for x=0
>f(x) = 0.5 for 1<x<2
>How does this distribution fit into your framework? Sure, you could have
>it implement the ContinuousDistribution interface but it *isn't* a
>continuous distribution (in the sense that it doesn't conform to the
>definition of a continuous distribution in probability theory) and
>then it shouldn't implement an interface called ContinuousDistribution.
>Recall: A random variable is continuous if its distribution function P(X <= x)
>can be expressed as the Riemann integral of some integrable function
>f: R -> [0, infinity)
>The basic problem is that you have an implicit assumption in your
>framework that each and every probability distribution can be classified
>as being either discrete or continuous. That is simply not true.
>Discrete and continuous distributions are really only special cases of
>a broader concept. Aside from that you also have the problem of how to
>handle the case of a discrete distribution that doesn't take on integer
>values.
Ack.
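For what it's worth, here is a rough sketch of how the simple example above (mass 0.5 at 0, density 0.5 on (1, 2)) could be expressed through its distribution function alone. The interface follows the ProbabilityDistribution proposal further down in this thread; the class name PointMassPlusUniform is made up for illustration, and the types are package-private only so the sketch fits in one file.

```java
// Hypothetical base interface, following the proposal in this thread.
interface ProbabilityDistribution {
    /** Returns P(X <= x). */
    double distributionFunction(double x);
}

/**
 * Mixed distribution: P(X = 0) = 0.5, with the remaining mass
 * spread uniformly (density 0.5) over the interval (1, 2).
 * It is neither discrete nor continuous.
 */
final class PointMassPlusUniform implements ProbabilityDistribution {
    @Override
    public double distributionFunction(double x) {
        if (Double.isNaN(x)) {
            return Double.NaN;            // no meaningful answer for NaN
        }
        if (x < 0.0) {
            return 0.0;                   // below the point mass
        }
        if (x <= 1.0) {
            return 0.5;                   // only the point mass at 0 counts
        }
        if (x < 2.0) {
            return 0.5 + 0.5 * (x - 1.0); // point mass plus uniform part
        }
        return 1.0;                       // all mass accumulated
    }
}
```

The jump of 0.5 at x = 0 is exactly what keeps this from being a ContinuousDistribution, while the linear ramp on (1, 2) keeps it from being a DiscreteDistribution.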
>Note: There are also distributions that are neither discrete, continuous, nor a
>mixture of the two. For example, there are numerous distributions based upon
>the Cantor ternary sets.
Practical counterexamples like what you have above are more compelling ;)
>The bottom line is that you *cannot* do without a generic
>ProbabilityDistribution interface.
>This interface should expose a method that exists for all distributions and completely
>determines a particular probability distribution, such as the
>distribution function P(X <= x).
>As an easy solution, you could define it as
>
>public interface ProbabilityDistribution {
> public double distributionFunction(double x);
>}
>and have ContinuousDistribution and DiscreteDistribution extend it.
>This should work ok (though the name DiscreteDistribution is misleading)
If we extend the base interface in DiscreteDistribution, this will make that fully generic,
no? Why is the name misleading? I am thinking that this interface would include both int
*and* double argument versions, with the int versions for convenience and ease of use for
the most common case in which the distribution corresponds to an integer-valued random variable.
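To make the int-plus-double idea concrete, here is a minimal sketch, not a proposal for the final API. The method name probability and the class FiniteDiscreteDistribution are placeholders of mine, and the int overloads are written as default methods, which assumes a Java version that supports them (otherwise they would live in an abstract base class).

```java
// Hypothetical base interface, as proposed above.
interface ProbabilityDistribution {
    /** Returns P(X <= x). */
    double distributionFunction(double x);
}

// Sketch of a discrete interface offering both double and int versions.
interface DiscreteDistribution extends ProbabilityDistribution {
    /** Returns P(X = x); general version for any real-valued support. */
    double probability(double x);

    /** Convenience overload for integer-valued random variables. */
    default double probability(int k) {
        return probability((double) k);
    }

    /** Convenience overload of P(X <= k) for integer arguments. */
    default double distributionFunction(int k) {
        return distributionFunction((double) k);
    }
}

/** Illustrative discrete distribution on finitely many (possibly non-integer) points. */
final class FiniteDiscreteDistribution implements DiscreteDistribution {
    private final double[] values;
    private final double[] masses;

    FiniteDiscreteDistribution(double[] values, double[] masses) {
        if (values.length != masses.length) {
            throw new IllegalArgumentException("values and masses differ in length");
        }
        this.values = values.clone();
        this.masses = masses.clone();
    }

    @Override
    public double probability(double x) {
        double p = 0.0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] == x) {
                p += masses[i];   // sum the mass sitting exactly at x
            }
        }
        return p;
    }

    @Override
    public double distributionFunction(double x) {
        double p = 0.0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] <= x) {
                p += masses[i];   // accumulate mass at or below x
            }
        }
        return p;
    }
}
```

With support points {0.5, 1.0, 2.5}, a caller can write probability(1) for the integer point and probability(0.5) for the non-integer one against the same interface.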
>but if you want a completely generic and type-safe definition you should
>go for something like
>public interface ProbabilityDistribution {
> public Probability distributionFunction(Number x);
>}
I think we can make it work with doubles and don't see a big loss there. I guess this is
where I get off the bus ;) though I see your point.
Thanks!
Phil
