On Tue, Oct 20, 2009 at 7:18 PM, prasenjit mukherjee
wrote:
> On Tue, Oct 20, 2009 at 10:22 PM, Ted Dunning
> wrote:
> > This is *exactly* the problem with LDA. You can try putting a logistic
> > regression step in the way to combine the positive or negative values
> > into a [0,1] value.
>
> Thanks for the pointer. Could you also explain (or point me to an article
> on) how to use logistic regression to get a [0,1] value out of the U/V
> vectors?
>
By the way, I misspoke in the quoted text. Negative values are a problem
with LSA, not LDA.

Getting continuous [0,1] values out of LSA isn't much of a trick. Just use
the LSA coordinates as the inputs to a logistic regression (or several of
them) and train it against whatever target you like, such as category
membership.
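
To make the logistic regression suggestion concrete, here is a minimal sketch in plain Python. The 2-D LSA coordinates, the labels, and the helper names (train_logistic, predict) are all made up for illustration, and the fitting loop is ordinary batch gradient descent rather than anything specific to a particular library.

```python
import math

def sigmoid(z):
    """Map any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, steps=2000, rate=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, t in zip(rows, labels):
            # Prediction error for this example
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - t
            for k in range(n_features):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(rows)
    return weights, bias

def predict(weights, bias, x):
    """Return a value in (0, 1) for one vector of LSA coordinates."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Hypothetical LSA coordinates (note the negative entries) and made-up labels
lsa_coords = [(-0.7, 0.1), (-0.6, -0.2), (0.5, 0.3), (0.8, -0.1)]
in_category = [0, 0, 1, 1]

weights, bias = train_logistic(lsa_coords, in_category)
scores = [predict(weights, bias, x) for x in lsa_coords]
```

The signs of the inputs no longer matter to the caller, because the sigmoid squashes the weighted sum into (0, 1). For several categories you would train one such model per category.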

Even simpler, transform the outputs with a soft-max operator. If x_i is the
i-th LSA coordinate and y_i is the corresponding output, use

    y_i = exp(a * x_i) / sum_j exp(a * x_j)

Pick the value of a to taste. A large a emphasizes the largest of the x
values more, and a small a less. The transform also ensures that every y
value lies in the range (0, 1) and that the y values sum to 1.
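
Here is a small Python sketch of that soft-max transform. The input vector is a made-up example, and subtracting the maximum before exponentiating is a standard numerical precaution that I am adding here; it does not change the result.

```python
import math

def softmax(xs, a=1.0):
    """Return y_i = exp(a * x_i) / sum_j exp(a * x_j) for each x_i."""
    # Shifting by the largest scaled value leaves the ratios unchanged
    # but keeps exp() from overflowing when a * x_i is large.
    top = max(a * x for x in xs)
    exps = [math.exp(a * x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

lsa_row = [-0.7, 0.1, 0.4]        # hypothetical LSA coordinates
soft = softmax(lsa_row, a=1.0)    # gentle emphasis on the largest x
sharp = softmax(lsa_row, a=10.0)  # most of the mass goes to the largest x
```

Comparing soft and sharp shows the effect of a: the entry for the largest coordinate grows as a increases, while both outputs still sum to 1.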
Normally this is done to transform the full Cartesian plane to unit simplex
coordinates to avoid boundary problems in algorithms such as Metropolis. It
should work almost as well with your problem.
> >
> > Or you could try LDA which is, essentially, a probabilistic version of
> > SVD that gives you exactly what you want.
>
> That was my first attempt. But the results are very sensitive to
> overfitting/underfitting. And since I don't even know the approximate
> L (number of latent variables), it is becoming difficult for me to use
> LDA/PLSI/approximate-SVD.
>
Ahh... well, there is good news for you. Nonparametric LDA exists, and it
doesn't require you to pick the number of latent variables in advance.

Neal and Ghahramani have done some work on this:
http://www.csri.toronto.edu/~roweis/papers/bmf_nips_final.pdf

And Jordan, Blei, Teh and the other usual suspects have also done some nice
work on it: http://www.cse.buffalo.edu/faculty/mbeal/papers/hdp.pdf

Here is a lecture by Michael Jordan:
http://videolectures.net/icml05_jordan_dpcrp/
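
To give a feel for how the number of latent groups can be left open, here is a rough Python sketch of the Chinese restaurant process, a standard construction of the Dirichlet process prior that models such as the HDP build on. This is not nonparametric LDA itself, only the prior over groupings. The function name, the alpha value, and the seed are my own choices for illustration.

```python
import random

def chinese_restaurant_process(n_items, alpha, seed=0):
    """Sample a grouping of n_items without fixing the group count up front."""
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of items in group k
    assignments = []   # assignments[i] = group index of item i
    for i in range(n_items):
        # Item i joins existing group k with probability size_k / (i + alpha)
        # and opens a new group with probability alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        choice = len(table_sizes)  # default: open a new group
        for k, size in enumerate(table_sizes):
            cumulative += size
            if r < cumulative:
                choice = k
                break
        if choice == len(table_sizes):
            table_sizes.append(0)
        table_sizes[choice] += 1
        assignments.append(choice)
    return assignments, table_sizes

assignments, table_sizes = chinese_restaurant_process(100, alpha=1.0)
```

The number of groups, len(table_sizes), comes out of the sampling rather than being set in advance. A larger alpha tends to produce more groups.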
> -Prasen
>
> >
> > On Tue, Oct 20, 2009 at 4:01 AM, prasenjit mukherjee
> > wrote:
> >
> >> Thanks a bunch, I fixed the problem by using Colt.
> >>
> >> Also I am trying to use U/V values to assign probabilities p(z|u) and
> >> p(z|s). My problem is how to interpret the negative U/V values and
> >> assign a positive probability value to such an entry.
> >>
> >> -Prasen
> >>
> >> On Sun, Oct 18, 2009 at 10:58 PM, Ted Dunning
> >> wrote:
> >> > I have not worked with lingpipe, but ...
> >> >
> >> > When I follow the steps you are taking using R, I get this:
> >> >
> >> > > docs = data.frame(d0=c(2,2,0,0), d1=c(2,2,0,0), d2=c(0,0,2,2),
> >> > +                   row.names=c("t0","t1","t2","t3"))
> >> > > docs
> >> >    d0 d1 d2
> >> > t0  2  2  0
> >> > t1  2  2  0
> >> > t2  0  0  2
> >> > t3  0  0  2
> >> > > svd(docs)
> >> > $d
> >> > [1] 4.000000 2.828427 0.000000
> >> >
> >
> > --
> > Ted Dunning, CTO
> > DeepDyve
> >
>
--
Ted Dunning, CTO
DeepDyve