commons-dev mailing list archives

From Alex Herbert <alex.d.herb...@gmail.com>
Subject Re: [statistics-regression] Proposed Regression class/method structure
Date Mon, 28 Oct 2019 21:12:10 GMT


> On 28 Oct 2019, at 17:55, Eric Barnhill <ericbarnhill@gmail.com> wrote:
> 
> Here is a schematic for how the interface might be made more abstract.
> 
> https://imgur.com/a/izx5Xkh

Regression and RegressionResults both have a predict method with the same signature.

> 
> In this case, we may want to just implement the simplest case, using Matrix
> and double[], for now.
> 
> In principle the RegressionMetric class could extend a Metrics class later.
> 
> Do you feel this would set up the library better for the future?

I know that the use case for a diagonal-only matrix was put forward previously, so I can see
the Matrix abstraction being useful. But should this then be Matrix<E>?

You have Vector<E> for most methods to pass a 1D set of numeric data. But the RegressionData.of
method accepts a double[]. This should also be Vector<E>.

I am assuming that Vector is an abstraction of a 1D data object.

What are the possible values for <E>? 

Double
double[]
Possibly complex numbers.
… ?

In that case Matrix<E> and Vector<E> would just denote that the analysis is done on a matrix
and a vector of the same type.

This would then require abstraction of all operations required by the regression objects such
as:

Vector<E> = Matrix<E>.multiply(Vector<E>)
Matrix<E> = Matrix<E>.transpose()

Etc.

Then you start by using concrete classes for Matrix<double[]> and Vector<double[]>.

I see that the nomenclature Matrix<double[]> is a bit of a misnomer, as it may be confused
with Matrix<double[][]>. So it would be documented that <E> is the type of the
entire data for a single matrix dimension. The matrix is actually an E[].
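
To make the idea concrete, here is a rough sketch of how this could look. None of it is
taken from the schematic: the accessor methods (data, size, row, rows, columns), the class
names DoubleVector and DoubleMatrix, and the static of factories are all assumptions made
for illustration. Only multiply and transpose come from the operations listed above.

```java
/** Hypothetical abstraction of a 1D data object; E is the type of the backing data. */
interface Vector<E> {
    E data();
    int size();
}

/** Hypothetical matrix abstraction; E is the type of the data for a single row. */
interface Matrix<E> {
    Vector<E> multiply(Vector<E> v);
    Matrix<E> transpose();
    E row(int i);
    int rows();
    int columns();
}

/** Assumed concrete Vector<double[]>; copies defensively on the way in and out. */
final class DoubleVector implements Vector<double[]> {
    private final double[] values;

    private DoubleVector(double[] values) {
        this.values = values;
    }

    static DoubleVector of(double... values) {
        return new DoubleVector(values.clone());
    }

    @Override
    public double[] data() {
        return values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }
}

/** Assumed concrete Matrix<double[]>; the storage is an E[] with E = double[]. */
final class DoubleMatrix implements Matrix<double[]> {
    private final double[][] rows;
    private final int columns;

    private DoubleMatrix(double[][] rows, int columns) {
        this.rows = rows;
        this.columns = columns;
    }

    static DoubleMatrix of(double[][] input) {
        int columns = input.length == 0 ? 0 : input[0].length;
        double[][] copy = new double[input.length][];
        for (int i = 0; i < input.length; i++) {
            if (input[i].length != columns) {
                throw new IllegalArgumentException("Rows must have equal length");
            }
            copy[i] = input[i].clone();
        }
        return new DoubleMatrix(copy, columns);
    }

    @Override
    public Vector<double[]> multiply(Vector<double[]> v) {
        if (v.size() != columns) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double[] x = v.data();
        double[] y = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            double sum = 0;
            for (int j = 0; j < columns; j++) {
                sum += rows[i][j] * x[j];
            }
            y[i] = sum;
        }
        return DoubleVector.of(y);
    }

    @Override
    public Matrix<double[]> transpose() {
        double[][] t = new double[columns][rows.length];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < columns; j++) {
                t[j][i] = rows[i][j];
            }
        }
        return new DoubleMatrix(t, rows.length);
    }

    @Override
    public double[] row(int i) {
        return rows[i].clone();
    }

    @Override
    public int rows() {
        return rows.length;
    }

    @Override
    public int columns() {
        return columns;
    }
}
```

With this shape the regression code would only see Matrix<E> and Vector<E>, and a
diagonal-only matrix could later be another Matrix<double[]> implementation without
changing callers.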


> 
> Eric

