spark-user mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: Computing cosine similarity using pyspark
Date Fri, 23 May 2014 15:37:18 GMT
Hi Jamal,

I don't believe PySpark has pre-written algorithms for cosine similarity or
Pearson correlation that you can reuse. If you end up writing your own
implementation, though, the project would definitely appreciate it if you
contributed that code back for future users to leverage!
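
For anyone who does write their own, a minimal sketch in plain Python might
look like the following. The function names cosine_similarity and
pearson_correlation are illustrative choices, not part of any Spark API, and
the sketch assumes each vector is a non-empty list of floats and that both
vectors in a pair have the same length. Pearson correlation is computed here
as the cosine similarity of the mean-centered vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Undefined when either vector has zero length.
        return float("nan")
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation, as cosine similarity of mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centered_u = [a - mean_u for a in u]
    centered_v = [b - mean_v for b in v]
    return cosine_similarity(centered_u, centered_v)


# Possible PySpark usage, assuming an existing SparkContext named `sc`
# and an RDD whose elements are (vector, vector) pairs:
#
#   pairs = sc.parallelize([([0.1234, -0.231, 0.23131], [0.5, 0.1, -0.3])])
#   results = pairs.map(
#       lambda p: (cosine_similarity(*p), pearson_correlation(*p))
#   ).collect()
```

Because both functions are ordinary Python with no Spark dependency, they can
be tested locally first and then passed to map() on an RDD of vector pairs.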

Andrew


On Thu, May 22, 2014 at 10:49 AM, jamal sasha <jamalshasha@gmail.com> wrote:

> Hi,
>   I have a bunch of vectors like
> [0.1234,-0.231,0.23131]
> ... and so on.
>
> and I want to compute cosine similarity and Pearson correlation using
> PySpark.
> How do I do this?
> Any ideas?
> Thanks
>
