mahout-user mailing list archives

From Federico Castanedo <castanedof...@gmail.com>
Subject Re: Yahoo's LDA code
Date Fri, 10 Jun 2011 13:49:35 GMT
Hi all,

I went through the referenced paper, and it seems that besides all the
distributed machinery, the way the inference for \alpha and \beta
is performed was the key element in improving LDA training performance.
They use SGD to adjust the hyperparameter \alpha.
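
To make that concrete, here is a rough sketch (in Java, since that is what
Mahout uses) of what a single SGD step on a symmetric \alpha could look like.
This is only my illustration of the general technique, not the code from the
paper; the class/method names, the digamma approximation, and the fixed
learning rate are all assumptions made up for the example:

// Illustrative sketch only: one stochastic gradient step on a symmetric
// Dirichlet hyperparameter alpha, driven by a single document's topic counts.
// All names here (AlphaSgdSketch, updateAlpha, ...) are hypothetical.
public class AlphaSgdSketch {

  // Digamma via the usual recurrence plus an asymptotic expansion.
  static double digamma(double x) {
    double result = 0.0;
    while (x < 6.0) {          // shift the argument up until the expansion is accurate
      result -= 1.0 / x;
      x += 1.0;
    }
    double inv = 1.0 / (x * x);
    result += Math.log(x) - 0.5 / x
        - inv * (1.0 / 12.0 - inv * (1.0 / 120.0 - inv / 252.0));
    return result;
  }

  // Gradient of log p(n_d | alpha) for a symmetric Dirichlet-multinomial,
  // where topicCounts[k] = n_{dk}; one SGD step moves alpha along that gradient.
  static double updateAlpha(double alpha, int[] topicCounts, double learningRate) {
    int k = topicCounts.length;
    int nd = 0;
    for (int c : topicCounts) nd += c;
    double grad = k * digamma(k * alpha) - k * digamma(k * alpha + nd);
    for (int c : topicCounts) {
      grad += digamma(alpha + c) - digamma(alpha);
    }
    double updated = alpha + learningRate * grad;
    return Math.max(updated, 1e-6);   // keep alpha strictly positive
  }

  public static void main(String[] args) {
    double alpha = 0.1;
    int[] counts = {12, 0, 3, 0, 25};  // toy topic counts for one document
    for (int i = 0; i < 10; i++) {
      alpha = updateAlpha(alpha, counts, 0.001);
    }
    System.out.println("alpha after 10 toy updates: " + alpha);
  }
}

In a real trainer you would of course stream documents (or minibatches) and
decay the learning rate, but the per-document gradient above is the part the
paper's \alpha adjustment revolves around.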

bests,
Federico

2011/6/10 Jake Mannix <jake.mannix@gmail.com>

> It's all C++, custom distributed processing, custom distributed
> coordination and storage.
>
> We can certainly try to port over the algorithmic ideas, but the
> distributed systems stuff would be a significant departure from our
> current setup - it's not a web service, it's not Hadoop, and it's not
> a command line utility - it's a cluster of long-running processes all
> intercommunicating.  Sounds awesome, but that's a ways off from where
> we are now.
>
>  -jake
>
> On Thu, Jun 9, 2011 at 7:52 PM, Stanley Xu <wenhao.xu@gmail.com> wrote:
>
> > Awesome! Guess it would be much faster than the current version in
> > Mahout. Is it possible to just use this version in Mahout?
> >
> > On Fri, Jun 10, 2011 at 8:12 AM, <jeremy@lewi.us> wrote:
> >
> > > Yahoo released its Hadoop code for LDA:
> > >
> > > http://blog.smola.org/post/6359713161/speeding-up-latent-dirichlet-allocation
