spark-dev mailing list archives

From DB Tsai <dbt...@dbtsai.com>
Subject Re: MLlib: Anybody working on hierarchical topic models like HLDA?
Date Thu, 04 Jun 2015 05:00:33 GMT
Is your HDP implementation based on distributed Gibbs sampling? Thanks.

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Wed, Jun 3, 2015 at 8:13 PM, Yang, Yuhao <yuhao.yang@intel.com> wrote:
> Hi Lorenz,
>
>
>
>   I’m trying to build a prototype of HDP for a customer based on the current
> LDA implementations. An initial version will probably be ready within the
> next one or two weeks. I’ll share it and hopefully we can join forces.
>
>
>
>   One concern is that I’m not sure how widely it will be used in the
> industry or community. Hope it’s popular enough to be accepted by Spark
> MLlib.
>
>
>
> http://www.cs.berkeley.edu/~jordan/papers/hierarchical-dp.pdf
>
> http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf
>
>
>
> Regards,
>
> Yuhao
>
>
>
> From: Joseph Bradley [mailto:joseph@databricks.com]
> Sent: Thursday, June 4, 2015 7:17 AM
> To: Lorenz Fischer
> Cc: dev@spark.apache.org
> Subject: Re: MLlib: Anybody working on hierarchical topic models like HLDA?
>
>
>
> Hi Lorenz,
>
>
>
> I'm not aware of people working on hierarchical topic models for MLlib, but
> that would be cool to see.  Hopefully other devs know more!
>
>
>
> Glad that the current LDA is helpful!
>
>
>
> Joseph
>
>
>
> On Wed, Jun 3, 2015 at 6:43 AM, Lorenz Fischer <lorenz.fischer@gmail.com>
> wrote:
>
> Hi All
>
>
>
> I'm working on a project in which I use the current LDA implementation that
> has been contributed by Databricks' Joseph Bradley et al. for the recent
> 1.3.0 release (thanks guys!). While this is great, my project requires
> several levels of topics, as I would like to let users drill down into
> subtopics.
>
>
>
> As I understand it, Hierarchical Latent Dirichlet Allocation (HLDA) would
> offer such a hierarchy. Looking at the papers and talks by Blei [1,2] and
> Jordan [3], I think I should be able to implement HLDA in Spark using the
> Nested Chinese Restaurant Process (NCRP). However, as I have some time
> constraints, I'm not sure if I will have the time to do it 'the proper way'.
>
>
>
> In any case, I wanted to quickly ask around if anybody is already working on
> this or on some other form of a hierarchical topic model. Maybe I could
> contribute to these efforts instead of starting from scratch.
>
>
>
> Best,
>
> Lorenz
>
>
>
> [1] http://www.cs.princeton.edu/~blei/papers/BleiGriffithsJordan2009.pdf
>
> [2] http://papers.nips.cc/paper/2466-hierarchical-topic-models-and-the-nested-chinese-restaurant-process.pdf
>
> [3] https://www.youtube.com/watch?v=PxgW3lOrj60
>
>
