commons-dev mailing list archives

From Alex Herbert <alex.d.herb...@gmail.com>
Subject Re: [statistics-regression] Proposed Regression class/method structure
Date Wed, 23 Oct 2019 10:12:29 GMT

On 23/10/2019 00:13, Gilles Sadowski wrote:
> Hello.
>
> On Tue, 22 Oct 2019 at 21:50, Eric Barnhill <ericbarnhill@gmail.com> wrote:
>> I propose the following class structure for commons-statistics-regression.
> Which?
> [The attachment was probably stripped; such material should go in a JIRA report.]

Quick first thoughts on the method names:

LinearRegression::RSquared

LogisticRegression::predictionProbs


Are these computing methods or property getters? I assume that all the 
computation is done in the methods:

Regression::fit

Regression::predict(double[])


Thus the methods in the implementation classes only access additional
results specific to the method, so they should be named as getters:

LinearRegression::getRSquared

LogisticRegression::getPredictionProbabilities(double[])
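
To illustrate the naming point, here is a minimal sketch, assuming that all
computation happens in fit and that getRSquared only returns a cached value.
The Regression interface shape, the single-predictor restriction and the
class name SimpleLinearRegression are assumptions made for this example, not
the proposed design.

```java
/** Hypothetical interface: only the names fit and predict come from the thread. */
interface Regression {
    /** Performs all the computation for the model. */
    void fit(double[] x, double[] y);

    /** Returns one prediction per input value. */
    double[] predict(double[] x);
}

/** Hypothetical single-predictor least-squares fit, for illustration only. */
final class SimpleLinearRegression implements Regression {
    private double intercept = Double.NaN;
    private double slope = Double.NaN;
    private double rSquared = Double.NaN;

    @Override
    public void fit(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("x and y must be non-empty and of equal length");
        }
        final int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        // Centred sums of squares and cross-products.
        double sxx = 0;
        double sxy = 0;
        double syy = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            final double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
        // Computed once here and cached; NaN if x or y is constant.
        rSquared = (sxy * sxy) / (sxx * syy);
    }

    @Override
    public double[] predict(double[] x) {
        final double[] result = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            result[i] = intercept + slope * x[i];
        }
        return result;
    }

    /** Property getter: no computation, returns the value stored by fit. */
    public double getRSquared() {
        return rSquared;
    }
}
```

With this split, calling getRSquared before fit returns NaN rather than
triggering any work, which is the behaviour the "get" prefix suggests.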


>> The interface carried over from commons-math is more of an academic
>> approach to thinking about regression. For rebooting the library (and I
>> hinted at this when I wrote the tickets for summer of code) I was hoping to
>> emulate widespread tools like R and scikit-learn, and consider that
>> "machine learning" is an increasingly popular use of regression. This
>> proposed structure creates an interface that is not the same as, but will
>> be very friendly to, anyone coming from R or scikit-learn, or similar tools
>> in JavaScript.
>>
>> There are of course many ways I can see to elaborate this scheme, say using
>> RegressionResult objects and so forth. But Matrices paired with a double[],
>> returning a double[] of coefficients or predictions, are likely to be the
>> most common use cases and should be plenty to get started.
> Commenting perhaps too early (not seeing the proposed design), but we broadly
> discussed that the linear algebra API is not easy to get right, and once we
> "get started", the trend is to be stuck with it for ages (related issues are
> among the oldest unresolved ones in CM).
>
>> Under the hood I would use the available implementations in commons-math to
>> get up and running, and worry about improving them later.
> Do you mean port from, or depend on, CM?

I assume that the Matrix object in the API is a new interface for
commons-statistics, which would allow the underlying implementation to be
pluggable. The initial version could include a shaded library to use
whatever is appropriate.
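
As a rough sketch of what "pluggable" could mean here, assuming a small
read-only Matrix interface owned by commons-statistics: the method names
rows, columns and get, and the classes ArrayMatrix and MatrixOps, are all
hypothetical and only serve to show that regression code can be written
against the interface while the storage is swapped behind it.

```java
/** Hypothetical minimal matrix abstraction; not an existing commons API. */
interface Matrix {
    int rows();

    int columns();

    double get(int row, int column);
}

/** One possible backing implementation: a plain 2D array (assumption). */
final class ArrayMatrix implements Matrix {
    private final double[][] data;

    ArrayMatrix(double[][] data) {
        // Defensive copy so later changes by the caller are not visible.
        this.data = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            this.data[i] = data[i].clone();
        }
    }

    @Override
    public int rows() {
        return data.length;
    }

    @Override
    public int columns() {
        return data.length == 0 ? 0 : data[0].length;
    }

    @Override
    public double get(int row, int column) {
        return data[row][column];
    }
}

/** Code written only against the interface, independent of the backing store. */
final class MatrixOps {
    private MatrixOps() {}

    /** Matrix-vector product, e.g. predictions from a design matrix and coefficients. */
    static double[] multiply(Matrix m, double[] v) {
        if (m.columns() != v.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        final double[] result = new double[m.rows()];
        for (int i = 0; i < m.rows(); i++) {
            double sum = 0;
            for (int j = 0; j < m.columns(); j++) {
                sum += m.get(i, j) * v[j];
            }
            result[i] = sum;
        }
        return result;
    }
}
```

An adapter wrapping a matrix class from a shaded library could implement the
same interface without any change to MatrixOps.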

Alex


>
> Regards,
> Gilles
>
>> Feedback appreciated,
>> Eric
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
