mahout-user mailing list archives

From Ted Dunning <>
Subject Re: Detecting high bias and variance in AdaptiveLogisticRegression classification
Date Thu, 28 Nov 2013 01:18:41 GMT
On Wed, Nov 27, 2013 at 7:07 AM, Vishal Santoshi <> wrote:

> Are we to assume that SGD is still a work in progress and that the
> implementations (Cross Fold, Online, Adaptive) are too flawed to be
> realistically used?

They are too raw to be accepted uncritically, for sure.  That said, they
have been used successfully in production.

> The evolutionary algorithm seems to be the core of
> OnlineLogisticRegression,
> which in turn builds up to Adaptive/Cross Fold.
> >>b) for truly on-line learning where no repeated passes through the data..
> What would it take to get to an implementation ? How can any one help ?

Would you like to help on this?  The amount of work required to get a
distributed asynchronous learner up is moderate, but definitely not huge.

I think that OnlineLogisticRegression is basically sound, but should get a
better learning rate update equation.  That would largely make the
Adaptive* stuff unnecessary, especially if OLR could be used in the
distributed asynchronous learner.
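
To make the learning-rate point concrete, here is a minimal sketch of a
single-pass logistic regression learner with two step-size rules: one global
decaying rate, and a per-coordinate rate scaled by accumulated squared
gradients (AdaGrad-style).  This is not Mahout code, and the message above
does not say which update equation is meant; the class name OnlineLogReg,
the parameter base_rate, and the choice of AdaGrad-style scaling are all
assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogReg:
    """Hypothetical binary logistic regression trained one example at a time."""

    def __init__(self, num_features, base_rate=0.5, adaptive=True):
        self.w = [0.0] * num_features
        self.g2 = [0.0] * num_features  # per-coordinate sum of squared gradients
        self.base_rate = base_rate      # assumed tuning knob, not a Mahout parameter
        self.adaptive = adaptive
        self.step = 0

    def predict(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)))

    def train(self, y, x):
        self.step += 1
        err = y - self.predict(x)       # gradient of the log-likelihood w.r.t. the score
        for i, xi in enumerate(x):
            g = err * xi
            if g == 0.0:
                continue
            if self.adaptive:
                # Per-coordinate rate: shrinks only for coordinates that
                # have already received large updates.
                self.g2[i] += g * g
                rate = self.base_rate / math.sqrt(self.g2[i])
            else:
                # Single global schedule that decays with the step count.
                rate = self.base_rate / math.sqrt(self.step)
            self.w[i] += rate * g


def demo(seed=0, n=2000):
    # One pass over a synthetic, linearly separable stream; no example is reused.
    rng = random.Random(seed)
    model = OnlineLogReg(num_features=3)
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 1 if x1 + x2 > 0 else 0
        model.train(y, [1.0, x1, x2])   # first feature is a constant bias term
    return model
```

Calling demo() trains on 2000 synthetic examples in one pass; passing
adaptive=False to OnlineLogReg switches to the global schedule so the two
rules can be compared on the same stream.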
