commons-dev mailing list archives

From Greg Sterijevski
Subject Re: (MATH-607) Current Multiple Regression Object
Date Wed, 06 Jul 2011 16:29:12 GMT
At Ted's suggestion I looked at LSMR in Mahout. In general, I have no
complaints about the algorithm or how it is coded up. I have seen the
algorithm in the PrimalDual solver that Michael Saunders et al. cooked up. I
believe the solver is part of the COIN project. I have nothing but praise
for it.

However, what I contacted Phil about was setting up some interfaces to
define a general contract so that we could code up different ways of
performing OLS. To wit, here is what I had in mind:

public interface UpdatingLinearRegression {
    public long getNobs();
    public void addData( double[] x, double y);
    public void addData( double[][] x, double[] y);
    public void clear();
    public RegressionResults regress()  throws MathException;
    public RegressionResults regress(int[] variablesToInclude)  throws

The other interface is:

public interface RegressionResults {
    public double getParameterEstimate(int index) throws
    public double[] getParameterEstimates();
    public double getStdErrorOfEstimate(int index) throws
    public double[] getStdErrorOfEstimates();
    public boolean isRedundant(int index) throws IndexOutOfBoundsException,
    public boolean[] getRedundant();
    public int getNumberOfParameters();
    public long getNobs();
    public double getTotalSumSquares();
    public double getRegressionSumSquares();
    public double getErrorSumSquares();
    public double getMeanSquareError();
    public double getRSquared();

Borrowing liberally from the SimpleRegression class, the above functionality
describes most of what a user would expect from a classical regression
analysis. What the interface buys us is the ability to support the many ways
to generate the results above: QR factorizations, in-place Gaussian
elimination, incremental SVD, and so forth.
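
To make the point about interchangeable backends concrete, here is a minimal
sketch of one possible implementation. Everything in it is an assumption made
for illustration: the names UpdatingRegressionSketch and
NormalEquationsRegression are hypothetical, the contract is a trimmed-down
stand-in for the interfaces above (it returns the raw coefficients rather than
a RegressionResults object, and has no variable subsetting), and the backend
simply accumulates X'X and X'y and solves them by Gaussian elimination with
partial pivoting. The caller supplies any intercept column explicitly.

```java
import java.util.Arrays;

/** Hypothetical, trimmed-down updating contract (illustration only). */
interface UpdatingRegressionSketch {
    long getNobs();
    void addData(double[] x, double y);
    void clear();
    double[] regress();
}

/** One possible backend: running normal equations plus Gaussian elimination. */
final class NormalEquationsRegression implements UpdatingRegressionSketch {
    private final int p;          // number of regressors
    private final double[][] xtx; // running X'X
    private final double[] xty;   // running X'y
    private long nobs;

    NormalEquationsRegression(int p) {
        this.p = p;
        this.xtx = new double[p][p];
        this.xty = new double[p];
    }

    @Override public long getNobs() { return nobs; }

    @Override public void addData(double[] x, double y) {
        if (x.length != p) {
            throw new IllegalArgumentException("expected " + p + " regressors");
        }
        for (int i = 0; i < p; i++) {
            xty[i] += x[i] * y;
            for (int j = 0; j < p; j++) {
                xtx[i][j] += x[i] * x[j];
            }
        }
        nobs++;
    }

    @Override public void clear() {
        for (double[] row : xtx) {
            Arrays.fill(row, 0.0);
        }
        Arrays.fill(xty, 0.0);
        nobs = 0;
    }

    @Override public double[] regress() {
        // Solve on an augmented copy so that more data can still be added later.
        double[][] a = new double[p][p + 1];
        for (int i = 0; i < p; i++) {
            System.arraycopy(xtx[i], 0, a[i], 0, p);
            a[i][p] = xty[i];
        }
        for (int col = 0; col < p; col++) {
            // Partial pivoting: pick the largest remaining entry in this column.
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new IllegalStateException("singular or redundant regressors");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            for (int r = col + 1; r < p; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= p; c++) {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
        // Back substitution on the upper-triangular system.
        double[] beta = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = a[i][p];
            for (int j = i + 1; j < p; j++) {
                s -= a[i][j] * beta[j];
            }
            beta[i] = s / a[i][i];
        }
        return beta;
    }
}
```

A QR-based or incremental-SVD class could implement the same contract without
any change to calling code, which is the whole purpose of the interface.
Forming X'X squares the condition number of the problem, so this particular
backend is the least numerically robust of the options listed and is shown
only because it is short.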
