Phil Steitz wrote:
> Piotr Kochan'ski wrote:
> > Mark R. Diggory wrote:
>
> >
> > Exactly, but the point is that we have to preserve the original/bootstrap
> > values, and EmpiricalDistribution does not store them; internally it keeps
> > the data in the array of bins.
>
> My thought was that we could do some things (e.g. estimate confidence
> intervals) without storing the bootstrap samples or even the full set of
> bootstrap statistics.
This is not a problem at all. When we initialize EmpiricalDistribution
using the load(...) method, we can calculate whatever we want, since we
have the data set at that moment.
The problem I see is that we have to specify a priori the statistics for
which a (bootstrap) confidence interval or standard error would be calculated.
We should not make that decision for the user, so some configuration of
the EmpiricalDistribution object would be necessary, e.g.
load(double[][], UnivariateStatistics[])
Then all the interesting calculations would be done for the provided
UnivariateStatistics. The default choice could be just SummaryStatistics:
load(double[][]) {
    statisticsToBeBootstrapped[] = all SummaryStatistics
}
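To make the idea concrete, here is a minimal sketch of what such a load(...)
could compute for each provided statistic, assuming each row of the double[][]
is one bootstrap sample. Everything in it is hypothetical and not existing
commons-math API: the class BootstrapSketch, its method names, and the use of
ToDoubleFunction<double[]> as a stand-in for UnivariateStatistics. It uses a
simple percentile interval, which is only one of several possible choices.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper for illustration only; not part of commons-math. */
final class BootstrapSketch {
    private BootstrapSketch() {}

    /** Returns values[i][j] = statistic i evaluated on bootstrap sample j. */
    static double[][] evaluate(double[][] bootstrapSamples,
                               List<ToDoubleFunction<double[]>> statistics) {
        double[][] values = new double[statistics.size()][bootstrapSamples.length];
        for (int i = 0; i < statistics.size(); i++) {
            for (int j = 0; j < bootstrapSamples.length; j++) {
                values[i][j] = statistics.get(i).applyAsDouble(bootstrapSamples[j]);
            }
        }
        return values;
    }

    /** Bootstrap standard error: sample standard deviation of the bootstrap values. */
    static double standardError(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two bootstrap values");
        }
        double m = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - m) * (v - m);
        }
        return Math.sqrt(sumSq / (values.length - 1));
    }

    /** Percentile interval {lower, upper} taken from the sorted bootstrap values. */
    static double[] percentileInterval(double[] values, double confidence) {
        if (values.length == 0 || confidence <= 0.0 || confidence >= 1.0) {
            throw new IllegalArgumentException("need values and 0 < confidence < 1");
        }
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        double alpha = 1.0 - confidence;
        int n = sorted.length;
        // Round outward so the interval is not narrower than requested.
        int lo = (int) Math.floor((alpha / 2.0) * (n - 1));
        int hi = (int) Math.ceil((1.0 - alpha / 2.0) * (n - 1));
        return new double[] { sorted[lo], sorted[hi] };
    }

    /** Arithmetic mean, usable as one of the statistics (NaN for an empty array). */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }
}
```

Note that only one double per statistic per bootstrap sample has to be kept
here, not the samples themselves.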
If bootstrap samples are not provided, e.g. the user calls another
load function, we can provide confidence intervals based on the
normal distribution assumption (for those statistics for which
they can be calculated).
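For the mean, for example, the normal-based fallback could look roughly like
the sketch below. The class NormalIntervalSketch and its method are
hypothetical names, not existing API, and the 95% level with the constant
1.959964 (approximately the 97.5% quantile of the standard normal) is just an
assumption chosen for illustration.

```java
/** Hypothetical helper for illustration only; not part of commons-math. */
final class NormalIntervalSketch {
    private NormalIntervalSketch() {}

    /** Approximate 97.5% quantile of the standard normal distribution. */
    static final double Z_975 = 1.959964;

    /**
     * Approximate 95% confidence interval {lower, upper} for the mean,
     * assuming the sample mean is approximately normally distributed.
     */
    static double[] meanInterval95(double[] data) {
        int n = data.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double sum = 0.0;
        for (double v : data) {
            sum += v;
        }
        double mean = sum / n;

        double sumSq = 0.0;
        for (double v : data) {
            sumSq += (v - mean) * (v - mean);
        }
        double variance = sumSq / (n - 1);           // unbiased sample variance
        double standardError = Math.sqrt(variance / n);

        double halfWidth = Z_975 * standardError;
        return new double[] { mean - halfWidth, mean + halfWidth };
    }
}
```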
In fact, we could leave the choice of which summary statistics to
calculate entirely to the user (e.g. for performance reasons: someone
may never be interested in some statistics, yet they are computed
anyway, which slows down initialization of the object).
load(String, UnivariateStatistics[]) etc.
Then the present getSampleStats() method should return
an object which gives access to the calculated statistics and/or
the confidence intervals for them.
Piotr

