mahout-user mailing list archives

From "Charlie Hack" <charles.t.h...@gmail.com>
Subject Re: Row Similarity
Date Thu, 14 May 2015 03:29:32 GMT
Hi Jonathan, how do you have the data stored? The more info about your setup, the better.


Charlie

On Wednesday, May 13, 2015 at 23:16, Jonathan Seale <jonathanpseale@gmail.com> wrote:
Scientists,

I have an astrophysical application for Mahout that I need help with.

I have 1-dimensional stellar spectra for many, many stars. Each spectrum
consists of a series of intensity values, one per wavelength of light. I
need to be able to find the cosine similarity between ALL pairs of stars.
Seems to me this is simply a user-user similarity problem where I have
stars instead of users, wavelengths instead of items, and intensities
instead of ratings/clicks.

But I'm having difficulty using Mahout's row similarity package (I'm new to
this, and these days astronomers code pretty exclusively in Python). I know
that I have to 1) create a sparse matrix where each row is a star, the
columns are wavelengths, and the values are intensities, and 2) implement
row similarity. But I'm just not sure how to do it. Does anyone have a good
resource, or would anyone be willing to help? I could probably offer some
compensation to anyone willing to provide a little focussed, personalized
assistance.

Thanks,

Jonathan
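
For reference, the computation described above (one row per star, one
column per wavelength, cosine similarity between every pair of rows) can be
sketched in plain Python. This is only an illustration of what row
similarity computes, not Mahout's implementation or its API. The star ids,
the toy intensities, and the helper names cosine_similarity and
all_pairs_similarity are all made up for the example, and it assumes every
spectrum is sampled on the same wavelength grid.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: an all-zero spectrum
        # is treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def all_pairs_similarity(spectra):
    """Map each unordered pair of star ids to its cosine similarity."""
    return {
        (i, j): cosine_similarity(spectra[i], spectra[j])
        for i, j in combinations(sorted(spectra), 2)
    }


# Hypothetical toy data: star id -> intensities on a shared wavelength grid.
spectra = {
    "star_a": [1.0, 0.0, 2.0, 0.0],
    "star_b": [2.0, 0.0, 4.0, 0.0],  # same shape as star_a, twice as bright
    "star_c": [0.0, 3.0, 0.0, 1.0],  # no wavelengths in common with star_a
}

similarities = all_pairs_similarity(spectra)
for pair, value in sorted(similarities.items()):
    print(pair, round(value, 6))
```

Because cosine similarity ignores overall brightness, star_a and star_b come
out as essentially identical, while star_a and star_c share no signal. The
loop visits every pair, so the work grows with the square of the number of
stars, which is the reason to reach for a distributed job once the catalogue
gets large.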