spark-dev mailing list archives

From Debasish Das <>
Subject Re: GraphX implementation of ALS?
Date Wed, 27 May 2015 03:24:05 GMT
In general, for implicit feedback in ALS you have to do a blocked Gram
matrix calculation, which might not fit the GraphX flow, and a lot of
blocked operations can be used... but if your loss is likelihood or KL
divergence, or just simple SGD update rules rather than least squares,
then the GraphX idea makes sense.

The LDA flow uses a similar idea, as the loss function is defined on
sparse data.
On May 26, 2015 4:46 PM, "Ankur Dave" <> wrote:

> This is the latest GraphX-based ALS implementation that I'm aware of:
> When I benchmarked it last year, it was about twice as slow as MLlib's
> ALS, and I think the latter has gotten faster since then. The performance
> gap is because the MLlib version implements some ALS-specific optimizations
> that are hard to do using GraphX, such as storing the edges twice
> (partitioned by source and by destination) to reduce communication.
> Ankur <>
> On Tue, May 26, 2015 at 3:36 PM, Ben Mabey <> wrote:
>> I've heard in a number of presentations Spark's ALS implementation was
>> going to be moved over to a GraphX version. For example, this
>> presentation on GraphX <> (slide #23) at the Spark Summit mentioned a
>> 40 LOC version using the Pregel API.
>> Looking at the ALS source on master
>> <>
>> it looks like the original implementation is still being used and no use of
>> GraphX can be seen. Other algorithms mentioned in the GraphX presentation
>> can be found in the repo
>> <>
>> already but I don't see ALS. Could someone link me to the GraphX version
>> for comparison purposes? Also, could someone comment on why the newer
>> version isn't in use yet (i.e., are there tradeoffs with the GraphX
>> version that make it less desirable)?
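[Editorial aside: the 40-LOC Pregel version from the slides isn't reproduced in this thread, but the general shape of a message-passing ALS half-step can be sketched as follows, with plain Python standing in for GraphX's aggregate-messages pattern. All names and structure here are illustrative, not the actual GraphX code.]

```python
# One ALS half-step on a bipartite rating graph: item factors are fixed,
# each edge (u, i, r) "sends" its item's contribution to the user vertex,
# and each user vertex solves its accumulated k x k normal equations.

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    k = len(b)
    M = [A[i][:] + [b[i]] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        x[r] = (M[r][k] - sum(M[r][c] * x[c] for c in range(r + 1, k))) / M[r][r]
    return x

def als_half_step(edges, item_factors, k, lam=0.1):
    """Update all user factors given fixed item factors.
    edges: list of (user, item, rating); item_factors: dict item -> list[k]."""
    # "aggregate messages": accumulate (y y^T, r * y) per user along edges.
    acc = {}
    for u, i, r in edges:
        y = item_factors[i]
        A, b = acc.setdefault(u, ([[0.0] * k for _ in range(k)], [0.0] * k))
        for p in range(k):
            b[p] += r * y[p]
            for q in range(k):
                A[p][q] += y[p] * y[q]
    # "vertex program": each user solves (A + lam * I) x = b locally.
    users = {}
    for u, (A, b) in acc.items():
        for p in range(k):
            A[p][p] += lam
        users[u] = solve(A, b)
    return users
```

Running this twice per iteration, with users and items swapping roles, gives one full ALS sweep. Each half-step wants edges grouped by the side currently being updated, which is one way to read Ankur's point above: MLlib materializes the edges in both orientations (partitioned by source and by destination), trading storage for less shuffle communication, an optimization that doesn't fall out naturally from a single GraphX edge partitioning.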
