mahout-user mailing list archives

From Kris Jack
Subject Generating a Document Similarity Matrix
Date Tue, 08 Jun 2010 13:38:26 GMT
Hi everyone,

I currently use Lucene's MoreLikeThis function through Solr to find
documents that are related to one another.  A single call, however, takes
around 4 seconds to complete, and I would like to reduce this.  I have been
thinking that I might be able to use Mahout to generate a document
similarity matrix offline that could then be looked up in real time when
serving.  Is this a reasonable use of Mahout?  If so, which functions will
generate a document similarity matrix?  I would also like to keep the
text-processing advantages that Lucene provides, so it would help if I could
still use my Lucene index.  If not, could you please recommend any
alternative solutions?
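
To make the idea concrete, here is a minimal sketch of the kind of offline
similarity table described above.  It is plain Python, not Mahout or Lucene
code: the tokenizer is a naive whitespace split standing in for Lucene's
analysis, and the function names (tfidf_vectors, top_k_similar) and the
sample documents are hypothetical.  It only illustrates precomputing cosine
similarities once and serving them by lookup.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build L2-normalised TF-IDF vectors from a {doc_id: text} mapping."""
    # Naive tokenisation; a real pipeline would reuse the index's analyzer.
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokens.values() for t in set(toks))
    n = len(docs)
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        # Smoothed IDF so that every weight stays positive.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[doc_id] = {t: w / norm for t, w in vec.items()} if norm else {}
    return vectors


def top_k_similar(docs, k=2):
    """Precompute, for each document, its k most similar other documents."""
    vectors = tfidf_vectors(docs)
    table = {}
    for a, va in vectors.items():
        scores = []
        for b, vb in vectors.items():
            if a == b:
                continue
            # Cosine similarity: dot product of unit-length vectors.
            sim = sum(w * vb.get(t, 0.0) for t, w in va.items())
            if sim > 0:
                scores.append((b, sim))
        # Highest similarity first; ties broken by document id.
        scores.sort(key=lambda pair: (-pair[1], pair[0]))
        table[a] = scores[:k]
    return table


docs = {
    "d1": "mahout builds similarity matrix offline",
    "d2": "offline similarity matrix for documents",
    "d3": "solr serves search queries",
}
related = top_k_similar(docs)
print(related["d1"])
```

At serving time a request for related documents would be a dictionary
lookup such as related["d1"] rather than a live query.  The all-pairs loop
is quadratic in the number of documents, which is why a large collection
would need a distributed job for this step.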

Many thanks,
