On 23/10/2019 00:13, Gilles Sadowski wrote:
> Hello.
>
> On Tue, 22 Oct 2019 at 21:50, Eric Barnhill <ericbarnhill@gmail.com> wrote:
>> I propose the following class structure for commons-statistics-regression.
> Which?
> [Attachment was probably stripped: such should go to a JIRA report.]
Quick first thoughts on the method names:
LinearRegression::RSquared
LogisticRegression::predictionProbs
Are these computing methods or property getters? I assume that all the
computation is done in the methods:
Regression::fit
Regression::predict(double[])
Thus the methods in the implementation classes access additional results
specific to the method, so they should be:
LinearRegression::getRSquared
LogisticRegression::getPredictionProbabilities(double[])
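
To illustrate the naming convention suggested above, here is a minimal sketch
in Java. It assumes that all computation happens in fit and predict, and that
getRSquared only returns a value stored by fit. The signatures, the
SimpleLinearRegression class and the single-predictor least-squares fit are
hypothetical, chosen to keep the example short; they are not the proposed API.

```java
/** Hypothetical interface: the computing methods live here. */
interface Regression {
    /** Fits the model and returns the coefficients. */
    double[] fit(double[] x, double[] y);

    /** Returns one prediction per input value. */
    double[] predict(double[] x);
}

/** Hypothetical single-predictor ordinary least squares implementation. */
final class SimpleLinearRegression implements Regression {
    // Results are NaN until fit has been called.
    private double intercept = Double.NaN;
    private double slope = Double.NaN;
    private double rSquared = Double.NaN;

    @Override
    public double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Need two or more paired samples");
        }
        final int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        // Centered sums of squares and cross products.
        double sxx = 0;
        double sxy = 0;
        double syy = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            final double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        // Degenerate input (constant x or constant y) is not handled here.
        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
        rSquared = (sxy * sxy) / (sxx * syy);
        return new double[] {intercept, slope};
    }

    @Override
    public double[] predict(double[] x) {
        final double[] result = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            result[i] = intercept + slope * x[i];
        }
        return result;
    }

    /** Property getter: returns the value computed by fit, no new computation. */
    public double getRSquared() {
        return rSquared;
    }
}
```

With this split, the "get" prefix signals that calling getRSquared is cheap and
has no side effects, while fit and predict are where the work is done.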
>> The interface carried over from commons-math is more of an academic approach
>> to thinking about regression. For rebooting the library (and I hinted at this
>> when I wrote the tickets for summer of code) I was hoping to emulate
>> widespread tools like R and scikit-learn, and consider that "machine
>> learning" is an increasingly popular use of regression. This proposed
>> structure creates an interface that is not the same as, but will be very
>> friendly to, anyone coming from R or scikit-learn, or similar tools in
>> JavaScript.
>>
>> There are of course many ways I can see to elaborate this scheme, say using
>> RegressionResult objects and so forth. But Matrices paired with a double[],
>> returning a double[] of coefficients or predictions, are likely to be the
>> most common use cases and should be plenty to get started.
> Commenting perhaps too early (not seeing the proposed design), but we broadly
> discussed that the linear algebra API is not easy to get right, and once we "get
> started", the trend is to be stuck with it for ages (related issues are among
> the oldest unresolved ones in CM).
>
>> Under the hood I would use the available implementations in commons-math to
>> get up and running, and worry about improving them later.
> Do you mean port from, or depend on, CM?
I assume that the Matrix object in the API is a new interface for
commons-statistics, thus allowing the underlying implementation to be
pluggable. The initial version could include a shaded library to use
whatever is appropriate.
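
As a rough sketch of what a pluggable Matrix could look like, assuming a small
read-only interface: the Matrix, ArrayMatrix and MatrixOps names and their
methods below are hypothetical and only illustrate the idea. An adapter around
a shaded linear algebra library would be one more implementation of the same
interface.

```java
/** Hypothetical minimal matrix abstraction used by the regression API. */
interface Matrix {
    int getRowDimension();

    int getColumnDimension();

    double getEntry(int row, int column);
}

/** Hypothetical default backend: a defensive copy of a double[][]. */
final class ArrayMatrix implements Matrix {
    private final double[][] data;

    ArrayMatrix(double[][] source) {
        data = new double[source.length][];
        for (int i = 0; i < source.length; i++) {
            data[i] = source[i].clone();
        }
    }

    @Override
    public int getRowDimension() {
        return data.length;
    }

    @Override
    public int getColumnDimension() {
        return data.length == 0 ? 0 : data[0].length;
    }

    @Override
    public double getEntry(int row, int column) {
        return data[row][column];
    }
}

/** Code written against the interface works with any backend. */
final class MatrixOps {
    private MatrixOps() {}

    /** Computes the matrix-vector product m * v. */
    static double[] operate(Matrix m, double[] v) {
        if (m.getColumnDimension() != v.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        final double[] result = new double[m.getRowDimension()];
        for (int i = 0; i < result.length; i++) {
            double sum = 0;
            for (int j = 0; j < v.length; j++) {
                sum += m.getEntry(i, j) * v[j];
            }
            result[i] = sum;
        }
        return result;
    }
}
```

The regression code would then depend only on Matrix, so the backend could be
swapped later without changing the public API.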
Alex
>
> Regards,
> Gilles
>
>> Feedback appreciated,
>> Eric
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>