spark-dev mailing list archives

From Reynold Xin <r...@databricks.com>
Subject Re: Contribute SimRank algorithm to mllib
Date Sat, 11 Jan 2014 06:40:01 GMT
Hi Jerry,

Why don't you submit a pull request so we can discuss it there? If
SimRank is not common enough, we might take just the matrix multiplication
method and merge that. At the very least, even if SimRank doesn't get
merged into Spark, we can include a contrib package or a Wiki page that
links to examples of various algorithms community members have implemented.




On Thu, Jan 9, 2014 at 9:29 PM, Shao, Saisai <saisai.shao@intel.com> wrote:

> Hi All,
>
> We would like to contribute the SimRank algorithm to mllib. SimRank
> computes a similarity score between two objects based on graph
> structure; details are in
> (http://ilpubs.stanford.edu:8090/508/1/2001-41.pdf). We implemented a
> matrix multiplication method based on the basic algorithm; the method is
> described in (http://www.cse.unsw.edu.au/~zhangw/files/wwwj.pdf),
> chapter 4.1.
>
> The implementation is abstracted and generalized from a real customer
> case. We made some tradeoffs to improve speed and reduce the shuffle
> size. Would this algorithm be suitable for mllib? What else should we
> take care of?
>
> Any suggestion would be really appreciated.
>
> Thanks
> Jerry
>
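
For readers unfamiliar with the matrix form of SimRank mentioned above, the
following is a minimal single-machine sketch in plain Python. It is not the
implementation proposed in this thread, which is not shown here. It assumes
the common formulation S <- c * W^T S W with the diagonal reset to 1, where W
is the adjacency matrix with each column divided by the in-degree of its node.
The function name simrank, the decay factor c = 0.8, and the iteration count
are illustrative assumptions, and the dense O(n^3) products would not scale
the way a Spark job needs to.

```python
def simrank(edges, n, c=0.8, iterations=10):
    """Dense SimRank sketch: S <- c * W^T S W, then reset the diagonal to 1.

    edges is an iterable of (src, dst) pairs over nodes 0..n-1.
    c and iterations are illustrative defaults, not values from the thread.
    """
    edges = set(edges)  # ignore duplicate edges

    # in_deg[j] = number of distinct in-neighbours of node j
    in_deg = [0] * n
    for _, dst in edges:
        in_deg[dst] += 1

    # Column-normalised adjacency: w[i][j] = 1 / in_deg[j] if i -> j exists.
    w = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        w[src][dst] = 1.0 / in_deg[dst]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    w_t = [[w[j][i] for j in range(n)] for i in range(n)]

    # Start from the identity: every node is fully similar to itself only.
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        s = matmul(matmul(w_t, s), w)
        for i in range(n):
            for j in range(n):
                s[i][j] = 1.0 if i == j else c * s[i][j]
    return s


# Nodes 1 and 2 are both pointed to only by node 0, so their similarity
# is c times the self-similarity of node 0.
scores = simrank([(0, 1), (0, 2)], n=3)
print(scores[1][2])
```

In a distributed setting the two matrix products are where the shuffle cost
would concentrate, which is presumably what the tradeoffs mentioned in the
quoted message address.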
