spark-dev mailing list archives

From "Yang, Yuhao" <yuhao.y...@intel.com>
Subject RE: MLlib: Anybody working on hierarchical topic models like HLDA?
Date Thu, 04 Jun 2015 09:12:13 GMT
Hi DB Tsai,

Not for now. My primary reference is
http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf

I'm also looking for a way to maximize code reuse. Any suggestions are welcome. Thanks.

Regards,
yuhao

-----Original Message-----
From: DB Tsai [mailto:dbtsai@dbtsai.com] 
Sent: Thursday, June 4, 2015 1:01 PM
To: Yang, Yuhao
Cc: Joseph Bradley; Lorenz Fischer; dev@spark.apache.org
Subject: Re: MLlib: Anybody working on hierarchical topic models like HLDA?

Is your HDP implementation based on distributed Gibbs sampling? Thanks.

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Wed, Jun 3, 2015 at 8:13 PM, Yang, Yuhao <yuhao.yang@intel.com> wrote:
> Hi Lorenz,
>
>
>
>   I’m trying to build a prototype of HDP for a customer based on the 
> current LDA implementations. An initial version will probably be ready 
> within the next one or two weeks. I’ll share it and hopefully we can join forces.
>
>
>
>   One concern is that I’m not sure how widely it will be used in the 
> industry or community. Hope it’s popular enough to be accepted by 
> Spark MLlib.
>
>
>
> http://www.cs.berkeley.edu/~jordan/papers/hierarchical-dp.pdf
>
> http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf
>
>
>
> Regards,
>
> Yuhao
>
>
>
> From: Joseph Bradley [mailto:joseph@databricks.com]
> Sent: Thursday, June 4, 2015 7:17 AM
> To: Lorenz Fischer
> Cc: dev@spark.apache.org
> Subject: Re: MLlib: Anybody working on hierarchical topic models like HLDA?
>
>
>
> Hi Lorenz,
>
>
>
> I'm not aware of people working on hierarchical topic models for 
> MLlib, but that would be cool to see.  Hopefully other devs know more!
>
>
>
> Glad that the current LDA is helpful!
>
>
>
> Joseph
>
>
>
> On Wed, Jun 3, 2015 at 6:43 AM, Lorenz Fischer 
> <lorenz.fischer@gmail.com>
> wrote:
>
> Hi All
>
>
>
> I'm working on a project in which I use the current LDA implementation 
> that has been contributed by Databricks' Joseph Bradley et al. for the 
> recent
> 1.3.0 release (thanks guys!). While this is great, my project requires 
> several levels of topics, as I would like to let users drill down 
> into subtopics.
>
>
>
> As I understand it, Hierarchical Latent Dirichlet Allocation (HLDA) 
> would offer such a hierarchy. Looking at the papers and talks by Blei 
> [1,2] and Jordan [3], I think I should be able to implement HLDA in 
> Spark using the Nested Chinese Restaurant Process (NCRP). However, as 
> I have some time constraints, I'm not sure if I will have the time to 
> do it 'the proper way'.
>
>
>
> In any case, I wanted to quickly ask around if anybody is already 
> working on this or on some other form of a hierarchical topic model. 
> Maybe I could contribute to these efforts instead of starting from scratch.
>
>
>
> Best,
>
> Lorenz
>
>
>
> [1] 
> http://www.cs.princeton.edu/~blei/papers/BleiGriffithsJordan2009.pdf
>
> [2]
> http://papers.nips.cc/paper/2466-hierarchical-topic-models-and-the-nested-chinese-restaurant-process.pdf
>
> [3] https://www.youtube.com/watch?v=PxgW3lOrj60
>
>