spark-user mailing list archives

From Asher Krim <ak...@hubspot.com>
Subject Re: Spark ML DataFrame API - need cosine similarity, how to convert to RDD Vectors?
Date Tue, 15 Nov 2016 19:25:12 GMT
Which language are you using? In Java, you can convert the DataFrame to an
RDD of vectors with something like this:

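// Assumes the column holds SparseVector values; a DenseVector would fail the cast.
JavaRDD<SparseVector> vectors = df
    .toJavaRDD()
    .map(row -> (SparseVector) row.getAs(row.fieldIndex("columnName")));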

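Once the vectors are out of the DataFrame, the cosine similarity itself can
be computed directly from each vector's indices and values arrays. Below is a
minimal sketch in plain Java, not anything taken from Spark: SparseCosine and
its cosine method are hypothetical names, and the code assumes the index
arrays are sorted in strictly increasing order, which is the layout
SparseVector.indices() and SparseVector.values() expose.

```java
// Hypothetical helper, not part of Spark. Computes the cosine similarity of two
// sparse vectors given as parallel (indices, values) arrays.
// Assumption: each index array is sorted in strictly increasing order.
final class SparseCosine {

    static double cosine(int[] idxA, double[] valA, int[] idxB, double[] valB) {
        double dot = 0.0;
        int i = 0;
        int j = 0;
        // Walk both sorted index arrays together; only shared indices
        // contribute to the dot product.
        while (i < idxA.length && j < idxB.length) {
            if (idxA[i] == idxB[j]) {
                dot += valA[i] * valB[j];
                i++;
                j++;
            } else if (idxA[i] < idxB[j]) {
                i++;
            } else {
                j++;
            }
        }
        double normA = norm(valA);
        double normB = norm(valB);
        // Convention chosen for this sketch: similarity with a zero vector is 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    // Euclidean (L2) norm of the stored values; absent entries are zero.
    private static double norm(double[] values) {
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += v * v;
        }
        return Math.sqrt(sumSquares);
    }
}
```

With two vectors a and b from the RDD above, this could be called as
SparseCosine.cosine(a.indices(), a.values(), b.indices(), b.values()).
Comparing every row of one dataset against every row of the other is a
separate problem (a cartesian product or a join), which this sketch does not
address.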
On Tue, Nov 15, 2016 at 1:06 PM, Russell Jurney <russell.jurney@gmail.com>
wrote:

> I have two DataFrames with common feature vectors and I need to get the
> cosine similarity of one against the other. It looks like this is possible
> in the RDD-based API, mllib, but not in ml.
>
> So, how do I convert my sparse DataFrame vectors into something Spark
> MLlib can use? I've searched, but haven't found anything.
>
> Thanks!
> --
> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io
>



-- 
Asher Krim
Senior Software Engineer
