commons-dev mailing list archives

From: Luc Maisonobe <>
Subject: Re: [math] Re: Longley Data
Date: Fri, 15 Jul 2011 06:56:31 GMT
On 15/07/2011 02:37, Greg Sterijevski wrote:
> The usual issues with numerical techniques: how you calculate
> (c * x + d * y) / e matters...
> It turns out that religiously following the article and defining
> c_bar = c / e is not a good idea.
> The Filippelli data is still a bit dicey. I would like to resolve where the
> error is accumulating there as well. That's really the last thing preventing
> me from sending the patch with the Miller-Gentleman regression to Phil.
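
[Editor's illustration of the order-of-operations point above; the values are invented for the sketch and have nothing to do with the Filippelli fit. When c * x and d * y nearly cancel, dividing by e last keeps the cancellation exact, while pre-scaling c_bar = c / e rounds before the cancellation, and the cancellation then amplifies those rounding errors.]

```java
public class ScalingOrderDemo {
    public static void main(String[] args) {
        // Hypothetical values chosen so that c * x and d * y nearly cancel.
        double c = 1.0, d = -1.0, e = 3.0;
        double x = 1.0e16 + 2.0, y = 1.0e16;

        // Divide last: c * x, d * y and their sum are all exact here,
        // so only the final division rounds.
        double divideLast = (c * x + d * y) / e;

        // Pre-scale as in c_bar = c / e: each division rounds first, and
        // the near-cancellation then amplifies those rounding errors.
        double cBar = c / e, dBar = d / e;
        double preScaled = cBar * x + dBar * y;

        System.out.println(divideLast); // prints 0.6666666666666666
        System.out.println(preScaled);  // prints 1.0
    }
}
```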

I don't know whether this is feasible in your case, but when trying to
track down this kind of numerical error, I have found it useful to redo
the computation in parallel at high precision. Up to a few months ago, I
was simply doing this with emacs (yes, emacs rocks) configured for 50
significant digits. Now it is easier since we have our own dfp package
in [math].
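
[Editor's sketch of the "shadow computation" idea: redo the same sequence of operations at 50 significant digits and compare. In [math] the dfp package would play this role; BigDecimal is used here only so the example is self-contained, and the values are invented for illustration.]

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class ShadowPrecisionDemo {
    public static void main(String[] args) {
        // The double computation under suspicion: a pre-scaled update.
        double c = 1.0, d = -1.0, e = 3.0;
        double x = 1.0e16 + 2.0, y = 1.0e16;
        double result = (c / e) * x + (d / e) * y;

        // The same steps, redone at 50 significant digits.
        MathContext mc = new MathContext(50);
        BigDecimal shadow = new BigDecimal(c).divide(new BigDecimal(e), mc)
                .multiply(new BigDecimal(x), mc)
                .add(new BigDecimal(d).divide(new BigDecimal(e), mc)
                        .multiply(new BigDecimal(y), mc), mc);

        // A large gap between the two runs points at the step losing accuracy.
        System.out.println("double  : " + result);
        System.out.println("50-digit: " + shadow.doubleValue());
    }
}
```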


> -Greg
> On Thu, Jul 14, 2011 at 1:18 PM, Ted Dunning <> wrote:
>> What was the problem?
>> On Wed, Jul 13, 2011 at 8:33 PM, Greg Sterijevski <> wrote:
>>> Phil,
>>> Got it! I fit longley to all printed values. I have not broken anything...
>>> I need to type up a few loose ends, then I will send a patch.
>>> -Greg
>>> On Tue, Jul 12, 2011 at 2:35 PM, Phil Steitz <> wrote:
>>>> On 7/12/11 12:12 PM, Greg Sterijevski wrote:
>>>>> All,
>>>>> So I included the wampler data in the test suite. The interesting thing
>>>>> is that to get clean runs I need wider tolerances with
>>>>> OLSMultipleRegression than with the version of the Miller algorithm I am
>>>>> coding up.
>>>> This is good for your Miller impl, not so good for OLSMultipleRegression.
>>>>> Perhaps we should come to a consensus on what good enough is. How close
>>>>> do we want to be? Should we require passing on all of NIST's 'hard'
>>>>> problems? (for all regression techniques that get cooked up)
>>>> The goal should be to match all of the displayed digits in the
>>>> reference data. When we can't do that, we should try to understand
>>>> why and aim, if possible, to improve the impls. As we improve the
>>>> code, the tolerances in the tests can be tightened. Characterization
>>>> of the types of models where the different implementations do well /
>>>> poorly is another thing we should aim for (and include in the
>>>> javadoc). As with all reference validation tests, we need to keep
>>>> in mind that a) the "hard" examples are designed to be numerically
>>>> unstable and b) conversely, a handful of examples does not really
>>>> demonstrate correctness.
>>>> Phil
>>>>> -Greg
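
[Editor's note: one way to encode Phil's "match all of the displayed digits" goal as a test tolerance is to allow half a unit in the last displayed decimal place. The helper and values below are hypothetical, purely to illustrate the idea.]

```java
public class DisplayedDigitsCheck {
    // Hypothetical helper: does `computed` agree with a certified value
    // published with `displayedDecimals` digits after the decimal point?
    static boolean matchesDisplayedDigits(double computed, double certified,
                                          int displayedDecimals) {
        // Half a unit in the last displayed place.
        double tol = 0.5 * Math.pow(10.0, -displayedDecimals);
        return Math.abs(computed - certified) <= tol;
    }

    public static void main(String[] args) {
        // 2/3 printed to 8 decimal places is 0.66666667; a computed value
        // within half a unit of that last place is accepted.
        System.out.println(matchesDisplayedDigits(2.0 / 3.0, 0.66666667, 8));   // prints true
        System.out.println(matchesDisplayedDigits(0.66666800, 0.66666667, 8));  // prints false
    }
}
```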

