commons-dev mailing list archives

From Greg Sterijevski <>
Subject Re: [math] Refactoring multiple regression classes
Date Thu, 14 Jul 2011 02:14:33 GMT

"How exactly do interfaces make the hierarchy flatter in this case?
I agree we should aim for as simple a structure as possible.  The
question is, what is that structure?"

They may or may not make the structure different. Any design we come up with
today is likely to be outmoded in six months. (In war, you throw your battle
plans out the window after the first five minutes.) What I propose is an
interface that holds the most minimal set of functionality (identifiable now)
that comprises regression. Over time, as we define more and more
implementations of regression, we might see further functionality that is
common across regressions. These methods will migrate to the interface. The
interface will grow organically. More importantly, any dependency that is not
too picky can use the interface reference instead of referencing the concrete
class. Dependencies that care will, and should, have intimate knowledge of the
class. Most pieces of code that depend on regression will not. The
interface will not preclude abstract classes.

The way I see it, you would have a core interface:

public interface RegressionIface {
    boolean hasIntercept();
    long getN();
    void addObservation(double[] x, double y);
    void addObservation(double[] xy);
    RegressionResults regress();
    RegressionResults regress(int[] vars);
}

You would then have a subinterface:

public interface UpdatingRegression extends RegressionIface {
    void clear();
    void addObservations(double[][] x, double[] y);
}

Why should code that is running a regression need to know more than this?
If, for example, the QR regression and the SVD-based regression share common
functionality for manipulating the data in core, then they can inherit from
an abstract base class which implements RegressionIface. The user in most
cases will not care. He/she may care whether the data is in core or not, but
that's about it.
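
To make the abstract-base-class idea concrete, here is a rough sketch, not a proposal for the actual API. AbstractInCoreRegression, NormalEquationsRegression, solve() and getParameters() are hypothetical names that do not appear anywhere in the thread. The solver is a plain normal-equations stand-in, not QR or SVD, and treating the last element of xy as the response is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical minimal results holder; the thread does not specify its fields. */
final class RegressionResults {
    private final double[] parameters;
    RegressionResults(double[] parameters) { this.parameters = parameters.clone(); }
    double[] getParameters() { return parameters.clone(); }
}

/** Core interface as proposed in the message. */
interface RegressionIface {
    boolean hasIntercept();
    long getN();
    void addObservation(double[] x, double y);
    void addObservation(double[] xy);
    RegressionResults regress();
    RegressionResults regress(int[] vars);
}

/** Hypothetical base class: keeps the data in core, leaves the solver to subclasses. */
abstract class AbstractInCoreRegression implements RegressionIface {
    private final List<double[]> xs = new ArrayList<>();
    private final List<Double> ys = new ArrayList<>();
    private final boolean intercept;

    protected AbstractInCoreRegression(boolean intercept) { this.intercept = intercept; }

    @Override public boolean hasIntercept() { return intercept; }
    @Override public long getN() { return xs.size(); }

    @Override public void addObservation(double[] x, double y) {
        xs.add(x.clone());
        ys.add(y);
    }

    /** Assumption: the response is the last element of xy. */
    @Override public void addObservation(double[] xy) {
        addObservation(Arrays.copyOf(xy, xy.length - 1), xy[xy.length - 1]);
    }

    /** Regress on all variables; requires at least one stored observation. */
    @Override public RegressionResults regress() {
        int[] all = new int[xs.get(0).length];
        for (int i = 0; i < all.length; i++) all[i] = i;
        return regress(all);
    }

    /** Build the design matrix for the selected variables and delegate to the solver. */
    @Override public RegressionResults regress(int[] vars) {
        int offset = intercept ? 1 : 0;
        double[][] design = new double[xs.size()][vars.length + offset];
        double[] y = new double[xs.size()];
        for (int r = 0; r < design.length; r++) {
            if (intercept) design[r][0] = 1.0;
            for (int c = 0; c < vars.length; c++) design[r][c + offset] = xs.get(r)[vars[c]];
            y[r] = ys.get(r);
        }
        return new RegressionResults(solve(design, y));
    }

    /** The only piece a QR-based or SVD-based subclass would have to supply. */
    protected abstract double[] solve(double[][] design, double[] y);
}

/** Stand-in solver (normal equations with Gaussian elimination), not QR or SVD. */
final class NormalEquationsRegression extends AbstractInCoreRegression {
    NormalEquationsRegression(boolean intercept) { super(intercept); }

    @Override protected double[] solve(double[][] a, double[] y) {
        int p = a[0].length;
        double[][] m = new double[p][p + 1]; // augmented matrix [A'A | A'y]
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++)
                for (double[] row : a) m[i][j] += row[i] * row[j];
            for (int r = 0; r < a.length; r++) m[i][p] += a[r][i] * y[r];
        }
        for (int k = 0; k < p; k++) { // forward elimination with partial pivoting
            int best = k;
            for (int i = k + 1; i < p; i++)
                if (Math.abs(m[i][k]) > Math.abs(m[best][k])) best = i;
            double[] tmp = m[k]; m[k] = m[best]; m[best] = tmp;
            for (int i = k + 1; i < p; i++) {
                double f = m[i][k] / m[k][k];
                for (int j = k; j <= p; j++) m[i][j] -= f * m[k][j];
            }
        }
        double[] beta = new double[p];
        for (int i = p - 1; i >= 0; i--) { // back substitution
            double s = m[i][p];
            for (int j = i + 1; j < p; j++) s -= m[i][j] * beta[j];
            beta[i] = s / m[i][i];
        }
        return beta;
    }
}
```

Calling code would hold only the interface reference, e.g. RegressionIface reg = new NormalEquationsRegression(true); and then call addObservation and regress() without knowing which solver sits behind it.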

The real action, in my opinion, is in the RegressionResults class. Here you
might need a bushy, thick tree. All regressions must generate an immutable
RegressionResults. However, that is the minimum info that would be
generated. We might, for example, have ConstrainedRegressionResults.

public class ConstrainedRegressionResults extends RegressionResults {
    private double[] lagrangian;
}
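
One way the "immutable results" requirement could carry through to a subclass is sketched below. Everything apart from the two class names and the lagrangian field is an assumption for illustration: the parameters and n fields, the constructors, and the getters are hypothetical. Immutability is kept here by making the fields final and copying arrays on the way in and on the way out.

```java
/** Hypothetical base results class; its fields are illustrative only. */
class RegressionResults {
    private final double[] parameters;
    private final long n;

    RegressionResults(double[] parameters, long n) {
        this.parameters = parameters.clone(); // defensive copy of the caller's array
        this.n = n;
    }

    /** Returns a copy so callers cannot modify the stored estimates. */
    double[] getParameters() { return parameters.clone(); }

    long getN() { return n; }
}

/** Adds the multipliers of a constrained fit on top of the minimum info. */
class ConstrainedRegressionResults extends RegressionResults {
    private final double[] lagrangian;

    ConstrainedRegressionResults(double[] parameters, long n, double[] lagrangian) {
        super(parameters, n);
        this.lagrangian = lagrangian.clone(); // defensive copy of the caller's array
    }

    /** Returns a copy so callers cannot modify the stored multipliers. */
    double[] getLagrangian() { return lagrangian.clone(); }
}
```

Code that only needs the minimum info can keep treating the object as a plain RegressionResults; code that cares about the constraints would have to know about the subclass.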

