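What language are you using? In Java, you can convert the DataFrame to an RDD with javaRDD() and then map each row to its vector, something like this (df stands in for your DataFrame, and columnName for the vector column):

    JavaRDD<SparseVector> vectors = df.javaRDD()
        .map(row -> (SparseVector) row.getAs(row.fieldIndex("columnName")));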

I have two DataFrames that share a feature-vector column, and I need to compute the cosine similarity of the vectors in one against the vectors in the other. It looks like this is possible in the RDD-based API (mllib), but not in ml.

So, how do I convert the sparse vectors in my DataFrames into something Spark MLlib can use? I've searched, but haven't found anything.
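For reference, the per-pair computation I'm after is roughly the following. This is only a plain-Java sketch, not a Spark API: the SparseCosine class and its cosine method are names made up for illustration, and it assumes each vector is given as parallel arrays of strictly increasing indices and their values (the layout a SparseVector exposes through indices() and values()).

```java
// Hypothetical helper, not part of Spark: cosine similarity of two sparse
// vectors, each given as parallel (sorted indices, values) arrays.
final class SparseCosine {
    private SparseCosine() {}

    static double cosine(int[] idxA, double[] valA, int[] idxB, double[] valB) {
        double dot = 0.0;
        int i = 0;
        int j = 0;
        // Walk both sorted index arrays in step; only indices present in
        // both vectors contribute to the dot product.
        while (i < idxA.length && j < idxB.length) {
            if (idxA[i] == idxB[j]) {
                dot += valA[i] * valB[j];
                i++;
                j++;
            } else if (idxA[i] < idxB[j]) {
                i++;
            } else {
                j++;
            }
        }
        double normA = norm(valA);
        double normB = norm(valB);
        // Assumption: similarity involving an all-zero vector is treated as 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    // Euclidean norm; entries not stored in the sparse vector are zero.
    private static double norm(double[] values) {
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += v * v;
        }
        return Math.sqrt(sumSquares);
    }
}
```

Doing this for every pair across two large DataFrames would need a join or a distributed routine, which is why I'm looking at the mllib side.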


