http://aliasi.com/lingpipe/demos/tutorial/logisticregression/readme.html
Search for "intercept" on that page.
Another way to look at this is that the model is trying to find a line that
separates your examples. Without the constant (intercept) term, all of
these lines will have to go through the origin. For your data, this isn't
going to find a usable model. Adding the 1 allows the lines to not go
through the origin.
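
To make that concrete, here is a minimal sketch in plain Java. It is only an
illustration: plain arrays stand in for Mahout's DenseVector, and the class
name, method names, and weights are made up for this example, not taken from
the code under discussion.

```java
// Illustrative sketch only: plain arrays stand in for Mahout's DenseVector,
// and all names and weights here are hypothetical.
class InterceptSketch {
    // Encode a 2-D point as a 3-element feature vector.
    // The trailing constant 1 is the intercept (bias) feature.
    static double[] withIntercept(double x, double y) {
        return new double[] {x, y, 1.0};
    }

    // Linear score w . v; the decision boundary is where this equals 0.
    static double score(double[] w, double[] v) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * v[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical weights: the boundary is x + y - 3 = 0,
        // a line that does not pass through the origin.
        double[] w = {1.0, 1.0, -3.0};
        System.out.println(score(w, withIntercept(0, 0))); // -3.0, one side
        System.out.println(score(w, withIntercept(2, 2))); // 1.0, other side

        // Without the constant feature the score at the origin is 0 for
        // every choice of weights, so every boundary contains the origin.
        double[] noBias = {1.0, 1.0};
        System.out.println(score(noBias, new double[] {0, 0})); // 0.0
    }
}
```

The third weight plays the role of the constant term: it shifts the line away
from the origin, which the two-weight version cannot do.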
On Thu, Jun 28, 2012 at 11:07 AM, Sean Owen <srowen@gmail.com> wrote:
> (The third dimension, 1, is the bias / intercept term. You will
> probably see this in the literature; go have a look at a basic intro
> to logistic regression. I found Andrew Ng's videos on Coursera a good
> intro-level survey of exactly this.)
>
> On Thu, Jun 28, 2012 at 3:57 PM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
> > On Thu, Jun 28, 2012 at 9:59 AM, damodar shetyo <akshay.shetye@gmail.com>
> > wrote:
> >
> >> This post is a continuation of another mailing thread that's going on.
> >> Sorry for creating a new thread, but I was not getting mails from the
> >> group before.
> >>
> >> The following code was implemented by Ted Dunning. Now I have a few
> >> questions:
> >>
> >> 1) The point (x, y) has 2 dimensions. Why are we using 3 instead of 2
> >> when creating the DenseVector?
> >> Vector v = new DenseVector(3); // why 3, why not 2?
> >>
> >> 2) In the getVector method, why do we call v.set(2, 1)?
> >>
> >> 3) What's the use of setting lambda?
> >>
> >
> > http://cseweb.ucsd.edu/~saul/teaching/cse291s07/L1norm.pdf
> >
> > (in the next one, C is used instead of lambda)
> >
> > http://www.ttic.edu/sigml/symposium2011/papers/Moore+DeNero_Regularization.pdf
> >
> > (and in this one, alpha is used)
> > http://en.wikipedia.org/wiki/Least_squares#LASSO_method
> >
> >> 4) What happens if I increase or decrease the learning rate?
> >>
> >
> > It affects the speed of convergence. A very high starting point can be
> > useful in some cases, but mostly just makes it take longer to converge.
> > A very low starting point can make convergence fail.
> >
> > http://leon.bottou.org/projects/sgd
>
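
As a rough illustration of the last point in the learning-rate answer quoted
above, here is a toy sketch of gradient descent on a one-dimensional
quadratic. The exponentially decaying schedule, the objective, and every name
in it are assumptions chosen for the example; this is not Mahout's actual
update rule or schedule.

```java
// Toy sketch only: the objective, the decay schedule, and all names are
// assumptions for illustration, not Mahout's implementation.
class LearningRateSketch {
    // Minimize f(w) = (w - 3)^2 / 2 starting from w = 0, using a learning
    // rate that starts at mu0 and is multiplied by decay after each step.
    static double descend(double mu0, double decay, int steps) {
        double w = 0.0;
        double rate = mu0;
        for (int t = 0; t < steps; t++) {
            double gradient = w - 3.0; // derivative of f at w
            w -= rate * gradient;
            rate *= decay;
        }
        return w;
    }

    public static void main(String[] args) {
        // A reasonable starting rate ends close to the optimum at 3.
        System.out.println(descend(0.5, 0.9, 50));
        // A very low starting rate decays away before w has moved far,
        // so the result stays near the initial value 0.
        System.out.println(descend(0.001, 0.9, 50));
    }
}
```

With a decaying rate the total distance the weights can travel is bounded, so
starting too low leaves the model stuck short of the optimum however many
steps are taken.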
