commons-dev mailing list archives

From Phil Steitz
Subject Re: [math] Re: Longley Data
Date Tue, 12 Jul 2011 19:35:20 GMT
On 7/12/11 12:12 PM, Greg Sterijevski wrote:
> All,
> So I included the Wampler data in the test suite. The interesting thing is
> that, to get clean runs, I need wider tolerances with OLSMultipleRegression
> than with the version of the Miller algorithm I am coding up.
This is good for your Miller impl, not so good for
> Perhaps we should come to a consensus on what good enough is? How close do
> we want to be? Should we require passing on all of NIST's 'hard' problems?
> (for all regression techniques that get cooked up)
The goal should be to match all of the displayed digits in the
reference data. When we can't do that, we should try to understand
why and aim, if possible, to improve the impls. As we improve the
code, the tolerances in the tests can be tightened. Characterizing
the types of models where the different implementations do well or
poorly is another thing we should aim for (and include in the
javadoc). As with all reference validation tests, we need to keep
in mind that a) the "hard" examples are designed to be numerically
unstable and b) conversely, a handful of examples does not really
demonstrate correctness.
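
One way to make "match all of the displayed digits" concrete is the log
relative error, which roughly counts the significant digits on which a
computed value agrees with a certified one. The sketch below is only an
illustration of that idea: the class and method names are hypothetical,
it is not part of the existing test suite or of any [math] API, and the
fallback to absolute error for a zero certified value is an assumption.

```java
// Hypothetical helper, not part of Commons Math: measures how many
// significant digits of a computed value agree with a certified value.
final class DigitAgreement {

    private DigitAgreement() {
    }

    /**
     * Log relative error: -log10(|computed - certified| / |certified|).
     * Returns positive infinity when the two values are exactly equal.
     * When the certified value is zero, the absolute error is used
     * instead (an assumption made for this sketch).
     */
    static double logRelativeError(double computed, double certified) {
        if (computed == certified) {
            return Double.POSITIVE_INFINITY;
        }
        double error = Math.abs(computed - certified);
        double scale = certified == 0.0 ? 1.0 : Math.abs(certified);
        return -Math.log10(error / scale);
    }

    /** True when the computed value agrees to at least the given number of digits. */
    static boolean matchesDisplayedDigits(double computed, double certified, int digits) {
        return logRelativeError(computed, certified) >= digits;
    }
}
```

A test could then assert on a number of digits per implementation and
dataset rather than on a hand-picked absolute tolerance, which makes it
easier to see where an implementation falls short and to raise the
requirement as the code improves.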

> -Greg
