commons-dev mailing list archives

From Greg Sterijevski <>
Subject Re: [math] Refactoring multiple regression classes
Date Thu, 14 Jul 2011 02:17:21 GMT
Sorry, I sent the previous email without finishing (apparently Gmail is not
idiot-proof...)


public class ConstrainedRegressionResults extends RegressionResults {
   private double[] lagrangian;
   private double[] varcov;
   public ConstrainedRegressionResults( double[] lagrangian, double[] varcov, ...){
     super( ... );
     //copy over the data
   }
}

The real growth and layering would occur in the results. Almost everything
in the linear regression world revolves around the sum of squares matrix or
its decomposition. The real action is what you do with it.
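
To make the layering concrete, here is a minimal sketch of how the results
classes could stack. It is only an illustration: the "parameters" field of
the base class and all getter names are assumptions, since nothing above
fixes what RegressionResults actually holds. Arrays are copied on the way in
and out so the results stay immutable.

```java
// Hypothetical base class. The "parameters" field is an assumption made
// for illustration; the thread does not specify the base-class contents.
class RegressionResults {
    private final double[] parameters;

    RegressionResults(double[] parameters) {
        // Defensive copy so later changes by the caller cannot leak in.
        this.parameters = parameters.clone();
    }

    double[] getParameters() {
        return parameters.clone();
    }
}

// One extra layer: everything a plain result has, plus the quantities
// that only exist for a constrained fit.
class ConstrainedRegressionResults extends RegressionResults {
    private final double[] lagrangian;
    private final double[] varcov;

    ConstrainedRegressionResults(double[] parameters,
                                 double[] lagrangian,
                                 double[] varcov) {
        super(parameters);
        this.lagrangian = lagrangian.clone();
        this.varcov = varcov.clone();
    }

    double[] getLagrangian() {
        return lagrangian.clone();
    }

    double[] getVarcov() {
        return varcov.clone();
    }
}
```

Code that only needs the basic output can hold a RegressionResults reference;
code that cares about the multipliers asks for the subclass.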


On Wed, Jul 13, 2011 at 9:14 PM, Greg Sterijevski <> wrote:

> Phil,
> "How exactly do interfaces make the hierarchy flatter in this case?
> I agree we should aim for as simple a structure as possible.  The
> question is, what is that structure?"
> They may or may not make the structure different. Any design we come up
> with today is likely to be outmoded in 6 months. (In war throw your battle
> plans out the window after the first five minutes.) What I propose is an
> interface which is the most minimal set of functionality (identifiable now)
> that constitutes regression. Over time, as we define more and more
> implementations of regression we might see further functionality which is
> common across regressions. These methods will migrate to the interface. The
> interface will grow organically. More importantly any dependency which is
> not too picky can use the interface reference, instead of referencing the
> concrete class. Dependencies which care will, and should, have intimate
> knowledge of the class. Most pieces of code which depend on regression will
> not. The interface will not preclude abstract classes.
> The way I see it, you would have a core interface:
> public interface RegressionIface {
>   boolean hasIntercept();
>   long getN();
>   void addObservation(double[] x, double y);
>   void addObservation(double[] xy);
>   RegressionResults regress();
>   RegressionResults regress(int[] vars);
> }
> You would then have a subinterface
> public interface UpdatingRegression extends RegressionIface {
>   void clear();
>   void addObservations(double[][] x, double[] y);
> }
> Why should code which is running a regression need to know more than this?
> If for example, the QR regression and the SVD based regression share common
> functionality for manipulating the data in core, then they can inherit from
> an abstract base class which implements RegressionIface.  The user in most
> cases will not care. He/she may care whether the data is in core or not, but
> that's about it.
> The real action, in my opinion, is in the RegressionResults class. Here you
> might need a bushy, thick tree. All regressions must generate an immutable
> RegressionResults. However, that is the minimum info that would be
> generated. We might, for example, have ConstrainedRegressionResults.
> public class ConstrainedRegressionResults extends RegressionResults {
>    private double[] lagrangian;
> }
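
As a rough illustration of the quoted proposal, the sketch below shows a
caller written against the interface alone. Everything is wrapped in a
hypothetical InterfaceSketch class to keep it self-contained. The interface
is trimmed to the methods the example needs, and OneVariableRegression (a
single regressor plus intercept, fitted from running sums) and the
getParameters() accessor are assumptions made for the example, not existing
[math] classes.

```java
final class InterfaceSketch {

    // Trimmed version of the proposed interface; the other overloads
    // are omitted only to keep the sketch short.
    interface RegressionIface {
        boolean hasIntercept();
        long getN();
        void addObservation(double[] x, double y);
        RegressionResults regress();
    }

    // Hypothetical minimal results holder: {intercept, slope}.
    static final class RegressionResults {
        private final double[] parameters;
        RegressionResults(double[] parameters) { this.parameters = parameters.clone(); }
        double[] getParameters() { return parameters.clone(); }
    }

    // Hypothetical implementation: one regressor, closed-form least squares.
    static final class OneVariableRegression implements RegressionIface {
        private long n;
        private double sumX, sumY, sumXX, sumXY;

        public boolean hasIntercept() { return true; }
        public long getN() { return n; }

        public void addObservation(double[] x, double y) {
            double xi = x[0];
            n++;
            sumX += xi;
            sumY += y;
            sumXX += xi * xi;
            sumXY += xi * y;
        }

        public RegressionResults regress() {
            double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
            double intercept = (sumY - slope * sumX) / n;
            return new RegressionResults(new double[] {intercept, slope});
        }
    }

    // The caller sees only RegressionIface, never the concrete class.
    static double[] fit(RegressionIface reg, double[][] x, double[] y) {
        for (int i = 0; i < y.length; i++) {
            reg.addObservation(x[i], y[i]);
        }
        return reg.regress().getParameters();
    }
}
```

Swapping in a QR- or SVD-based implementation would leave fit() untouched,
which is the point of depending on the interface reference.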
