mahout-user mailing list archives

From Frank Wang <wangfan...@gmail.com>
Subject Re: Implementation for Linear Regression
Date Thu, 21 Oct 2010 04:32:39 GMT
Hi Ted,

I've created the JIRA issue at
https://issues.apache.org/jira/browse/MAHOUT-529 and will attach what I have
soon.

Do you mean using time as a feature in the logistic regression? I thought
about your suggestion the other day, but I'm not recalculating the
probability on the old data. After training each night, we only apply the
coefficients to the next day's new data. I'm not quite sure how the decay
function would work in this case. Do you have an example?

Thanks


On Wed, Oct 20, 2010 at 8:48 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Can you open a JIRA and attach a patch?
>
> Your approach seems reasonable so far for the regression.
>
> In terms of how it could be applied, it seems like you are trying to
> estimate a life-span for a posting to model relevance decay.
>
> My own preference there would be to try to estimate relevance (0 or 1) using
> logistic regression and then put various decay functions in as features.
> The weighted sum of those decay functions is your time decay of relevance
> (in log-odds).
>
> My initial shot at decay functions would include age, square of age, and log
> of age.  My guess is that direct age would suffice because of the logistic
> link function, which looks like a logarithmic function where your models will
> probably live.
>
> On Wed, Oct 20, 2010 at 8:15 PM, Frank Wang <wangfanjie@gmail.com> wrote:
>
> > Hi Ted,
> >
> > thanks for your reply.
> > I'm trying a new model where I want to estimate the output as a timespan
> > quantified in number of seconds, which is not bounded. That's why I think
> > I'd use linear regression instead of logistic regression. (lemme know if
> > I'm wrong)
> >
> > I started on the code yesterday. The new AbstractOnlineLinearRegression
> > class implements the OnlineLearner interface. I updated the classify()
> > function to use a linear model. I tried to follow the format of
> > AbstractOnlineLogisticRegression.
> >
> > I think since linear regression can be implemented with SGD, the train()
> > and regularize() functions would look similar. I'm not sure if I'm on the
> > right path. Any advice would be helpful.
> >
> > Thanks
> >
> > On Wed, Oct 20, 2010 at 3:34 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> > > Frank,
> > >
> > > Sorry I didn't answer your previous email regarding this.
> > >
> > > It sounded to me like your application would actually be happier with a
> > > form of logistic regression.
> > >
> > > Perhaps we should talk some more about this on the list.
> > >
> > > If you want a normal linear regression, the current OnlineLearner interface
> > > isn't terribly appropriate since it assumes a 1-of-n vector target
> > > variable.
> > >
> > > If you were to extend that interface to accept a vector form of target
> > > variable then linear regression would work (and some clever tricks would
> > > become possible for logistic regression).
> > >
> > >
> > >
> > > On Wed, Oct 20, 2010 at 1:57 PM, Frank Wang <wangfanjie@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm interested in implementing Linear Regression in Mahout. Who would be
> > > > the point person for the algorithm? I'd love to discuss the implementation
> > > > details, or to help out if anyone is working on it already :)
> > > >
> > > > Thanks
> > > >
> > >
> >
>
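
On the SGD point in the quoted thread (that train() would look much the same
for linear and logistic regression), here is a minimal sketch in plain Python.
It is not Mahout code. The function names, the l2 weight-decay term, the
learning rate, and the synthetic data are all assumptions made for
illustration. Inputs and targets are assumed to be rescaled (for example,
seconds to hours) so that a fixed learning rate stays stable.

```python
import random


def predict(weights, x):
    """Linear model: the prediction is the raw dot product, with no sigmoid."""
    return sum(w * xi for w, xi in zip(weights, x))


def train(samples, n_features, epochs=50, rate=0.05, l2=0.0, seed=0):
    """SGD on squared error; samples are (feature_list, target) pairs.

    l2 is a hypothetical weight-decay strength (0.0 disables it).
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            error = y - predict(weights, x)
            # Same update shape as logistic SGD; only the prediction differs.
            weights = [w + rate * (error * xi - l2 * w)
                       for w, xi in zip(weights, x)]
    return weights


# Made-up data: target = 3 + 2 * x1 + small Gaussian noise.
rng = random.Random(1)
samples = []
for _ in range(500):
    x1 = rng.random()
    samples.append(([1.0, x1], 3.0 + 2.0 * x1 + rng.gauss(0.0, 0.1)))

weights = train(samples, n_features=2)
print("intercept, slope:", [round(w, 2) for w in weights])
```

The only change from a logistic learner in this sketch is that predict()
returns the dot product directly, so the target can be any real number.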

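As a rough illustration of the decay-function idea quoted above, the sketch
below feeds age, age squared, and a log-of-age term into a logistic regression
trained with plain SGD. It is written in plain Python rather than against the
Mahout API, and everything specific in it is an assumption: the function names,
the 7-day horizon used for scaling, the use of log(1 + age) so that age 0 is
defined, and the synthetic training data. Nothing is recalculated for old data.
The nightly weights are simply applied to each new posting's age at scoring
time.

```python
import math
import random


def decay_features(age, horizon=7.0):
    """Map a posting's age (days) to [bias, age, age^2, log(1 + age)].

    horizon is a hypothetical scale (days) that keeps features near [0, 1].
    """
    a = age / horizon
    return [1.0, a, a * a, math.log1p(a)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, age):
    """Estimated probability that a posting of this age is relevant."""
    x = decay_features(age)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train(samples, epochs=20, rate=0.1, seed=0):
    """Plain SGD on the logistic log-likelihood; samples are (age, label)."""
    rng = random.Random(seed)
    weights = [0.0] * 4
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for age, label in data:
            x = decay_features(age)
            error = label - sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


def synthetic_samples(n=2000, seed=1):
    """Made-up data: the chance of being relevant falls off with age."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        age = rng.uniform(0.0, 7.0)
        p = sigmoid(2.0 - 6.0 * age / 7.0)
        out.append((age, 1 if rng.random() < p else 0))
    return out


weights = train(synthetic_samples())
# Nightly use: score the next day's postings with the stored weights.
fresh, stale = predict(weights, 0.25), predict(weights, 6.5)
print(f"P(relevant | 6 hours old)  = {fresh:.3f}")
print(f"P(relevant | 6.5 days old) = {stale:.3f}")
```

The weighted sum inside predict() is the log-odds of relevance, so the learned
weights on the three age terms together define the decay curve.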