mahout-user mailing list archives

From Jake Mannix <jake.mannix@gmail.com>
Subject Re: Using Restricted Boltzmann for clustering
Date Sat, 05 Dec 2009 05:21:21 GMT
On Fri, Dec 4, 2009 at 8:07 PM, prasenjit mukherjee <pmukherjee@quattrowireless.com> wrote:

> I am indeed learning via the CD technique, but using only a single
> layer, where the computation toggles between the same visible and
> hidden layers. I guess I was too optimistic in expecting results from
> the first RBM itself.
>

  Have you read Semantic Hashing (Salakhutdinov and Hinton, pdf:
http://www.cs.utoronto.ca/%7Ehinton/absps/sh.pdf)?  It gives a good
explanation of why the multi-layered approach is so necessary.  One layer
of an RBM is not much more than a plain old-fashioned single-layer neural
net, which, from what I remember, has never been very good at this kind
of thing.  Not only are multiple layers needed, but fine-tuning by
old-fashioned gradient descent after the CD pretraining is also necessary.

  -jake
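
A minimal sketch of the single-layer CD-1 update described above (toggle
v -> h -> v' -> h'), in illustrative Java. This is not Mahout code: the
SimpleRbm class and every name in it are hypothetical, and it assumes
binary visible and hidden units throughout.

import java.util.Random;

/** Single-layer RBM with binary units, trained by one step of
 *  contrastive divergence (CD-1). Illustrative sketch only. */
public class SimpleRbm {
  final int numVisible, numHidden;
  final double[][] w;                  // weights, numVisible x numHidden
  final double[] visBias, hidBias;
  final Random rng = new Random(42);

  SimpleRbm(int numVisible, int numHidden) {
    this.numVisible = numVisible;
    this.numHidden = numHidden;
    w = new double[numVisible][numHidden];
    visBias = new double[numVisible];
    hidBias = new double[numHidden];
    for (int i = 0; i < numVisible; i++)
      for (int j = 0; j < numHidden; j++)
        w[i][j] = 0.01 * rng.nextGaussian();   // small random init
  }

  static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

  /** p(h_j = 1 | v) for each hidden unit. */
  double[] hiddenProbs(double[] v) {
    double[] p = new double[numHidden];
    for (int j = 0; j < numHidden; j++) {
      double a = hidBias[j];
      for (int i = 0; i < numVisible; i++) a += v[i] * w[i][j];
      p[j] = sigmoid(a);
    }
    return p;
  }

  /** p(v_i = 1 | h) for each visible unit. */
  double[] visibleProbs(double[] h) {
    double[] p = new double[numVisible];
    for (int i = 0; i < numVisible; i++) {
      double a = visBias[i];
      for (int j = 0; j < numHidden; j++) a += h[j] * w[i][j];
      p[i] = sigmoid(a);
    }
    return p;
  }

  double[] sample(double[] probs) {
    double[] s = new double[probs.length];
    for (int k = 0; k < probs.length; k++)
      s[k] = rng.nextDouble() < probs[k] ? 1.0 : 0.0;
    return s;
  }

  /** One CD-1 step: up (v0 -> h0), sample h0, down (-> v1), up again
   *  (-> h1); move the weights toward the data statistics v0*h0 and
   *  away from the one-step reconstruction statistics v1*h1. */
  void cd1(double[] v0, double learningRate) {
    double[] h0 = hiddenProbs(v0);
    double[] h0Sample = sample(h0);
    double[] v1 = visibleProbs(h0Sample);  // reconstruction probabilities
    double[] h1 = hiddenProbs(v1);
    for (int i = 0; i < numVisible; i++)
      for (int j = 0; j < numHidden; j++)
        w[i][j] += learningRate * (v0[i] * h0[j] - v1[i] * h1[j]);
    for (int i = 0; i < numVisible; i++)
      visBias[i] += learningRate * (v0[i] - v1[i]);
    for (int j = 0; j < numHidden; j++)
      hidBias[j] += learningRate * (h0[j] - h1[j]);
  }
}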

> I believe in stacked RBMs you repeat the same thing for more than one
> layer, where the results of the current hidden layer get passed on as
> the visible layer to a new RBM with a new hidden layer, and you again do
> contrastive divergence in that new RBM. You could possibly have a
> different number of hidden neurons at each layer. It makes sense to keep
> reducing the number of hidden neurons at each subsequent layer.
>
> Anyway, I will see if trying multiple layers gives any improvement.
>
> -Thanks,
> Prasen
>
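
The greedy layer-wise stacking described above, as a sketch reusing the
hypothetical SimpleRbm class from earlier: each RBM's hidden-unit
probabilities become the "visible" data for the next, narrower layer.
The epochs and learning rate are illustrative, and the backprop
fine-tuning pass that follows pretraining is not shown.

/** Greedy layer-wise pretraining of a stack of RBMs (sketch). */
public class RbmStack {
  static SimpleRbm[] trainStack(double[][] data, int[] hiddenSizes,
                                int epochs, double learningRate) {
    SimpleRbm[] stack = new SimpleRbm[hiddenSizes.length];
    double[][] layerInput = data;
    for (int layer = 0; layer < hiddenSizes.length; layer++) {
      SimpleRbm rbm = new SimpleRbm(layerInput[0].length, hiddenSizes[layer]);
      for (int e = 0; e < epochs; e++)
        for (double[] v : layerInput) rbm.cd1(v, learningRate);
      // Hidden-unit probabilities become the next layer's visible data.
      double[][] next = new double[layerInput.length][];
      for (int n = 0; n < layerInput.length; n++)
        next[n] = rbm.hiddenProbs(layerInput[n]);
      stack[layer] = rbm;
      layerInput = next;
    }
    return stack;
  }
}

For example, hiddenSizes = {500, 250, 64} over a bag-of-words input gives
the kind of progressively narrower hidden layers discussed here.
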
> On Fri, Dec 4, 2009 at 11:25 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> > Prasen,
> >
> >  I thought the whole point of the RBM approach to autoencoders /
> > dimensionality reduction was the stacked approach: you don't need full
> > convergence layer by layer, but instead do the layer-by-layer
> > "contrastive divergence" technique Hinton advocates, and then fine-tune
> > at the end?  I wouldn't imagine you'd get very good relevance from a
> > single layer.
> >
> >  -jake
> >
> > On Fri, Dec 4, 2009 at 8:37 AM, prasenjit mukherjee <pmukherjee@quattrowireless.com> wrote:
> >
> >> I did try it out on some sample data where my visible layer was Linear
> >> and the hidden layer was StochasticBinary.  Using a single-layer RBM
> >> didn't give me great results. I guess I should try the stacked RBM
> >> approach.
> >>
> >> BTW, has anybody used a single-layer RBM on a doc x term probability
> >> matrix (i.e. a continuous visible layer) with values in 0-1 for
> >> collaborative filtering?
> >>
> >> -Prasen
> >>
> >> On Thu, Dec 3, 2009 at 12:40 AM, Olivier Grisel
> >> <olivier.grisel@ensta.org> wrote:
> >> > 2009/12/2 Jake Mannix <jake.mannix@gmail.com>:
> >> >> Prasen,
> >> >>
> >> >>  I was just talking about this on here last week.  Yes, RBM-based
> >> >> clustering can be viewed as a nonlinear SVD.  I'm pretty interested
> >> >> in your findings on this.  Do you have any RBM code you care to
> >> >> contribute to Mahout?
> >> >
> >> > Hi,
> >> >
> >> > I have some C + Python code for stacking autoencoders, which shares
> >> > features similar to a DBN (stacked RBMs), here:
> >> > http://bitbucket.org/ogrisel/libsgd/wiki/Home
> >> >
> >> > This is still very much a work in progress; I will let you know when
> >> > I have easy-to-run sample demos.
> >> >
> >> > However, this algorithm is not trivially MapReducible, but I plan to
> >> > investigate that in the coming weeks. It would be nice to have a pure
> >> > JVM version too. I am also planning to play with Clojure + Incanter
> >> > (with the Parallel Colt library as the linear algebra backend) to
> >> > make it easier to work with Hadoop.
> >> >
> >> > --
> >> > Olivier
> >> > http://twitter.com/ogrisel - http://code.oliviergrisel.name
> >> >
> >>
> >
>
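
On the continuous visible layer question above: two common options are to
feed the 0-1 values directly into the binary-unit formulas as if they were
probabilities, or to switch the visible layer to Gaussian (linear) units.
Below is a hypothetical variant of SimpleRbm's visibleProbs for
unit-variance Gaussian visible units; it assumes the inputs have been
standardized, and the CD-1 weight update keeps the same form.

/** Reconstruction means for Gaussian (linear) visible units with unit
 *  variance: the raw linear activation, with no sigmoid squashing. */
double[] visibleMeansLinear(double[] h) {
  double[] m = new double[numVisible];
  for (int i = 0; i < numVisible; i++) {
    double a = visBias[i];
    for (int j = 0; j < numHidden; j++) a += h[j] * w[i][j];
    m[i] = a;  // identity instead of sigmoid
  }
  return m;
}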
