lucene-dev mailing list archives

From Andy Liu <>
Subject Lucene and Latent Semantic Indexing
Date Mon, 14 Nov 2005 23:50:44 GMT
I'm currently experimenting with latent semantic indexing (LSI) techniques and
Lucene. I need to extract term frequencies from a Lucene index, construct
a document/term matrix, and then run some matrix algorithms on it. These
produce floating-point, and potentially negative, term frequency values.
Extracting the tf's from the Lucene index is easy. The hard part is importing
the modified tf's back into the index, since Lucene stores tf's as integer
values.
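
To make the matrix step concrete, here is a minimal sketch in plain Python with no Lucene involved. The sample counts, the function names, and the choice of power iteration with deflation as a stand-in for a proper SVD routine are all assumptions made for illustration. It only shows that a low-rank approximation of an integer document/term matrix comes back as floats that no longer fit an integer tf slot.

```python
import math
import random


def top_singular_triplet(a, iters=500):
    """Estimate the largest singular value of `a` and its vectors by power iteration."""
    rows, cols = len(a), len(a[0])
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    v = [rng.random() + 0.1 for _ in range(cols)]
    for _ in range(iters):
        # u = A v, then w = A^T u, i.e. one multiplication by A^T A
        u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, [0.0] * rows, [0.0] * cols
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma == 0.0:
        return 0.0, [0.0] * rows, v
    return sigma, [x / sigma for x in u], v


def rank_k_approximation(a, k):
    """Sum the k leading terms sigma * u * v^T, deflating the residual each round."""
    rows, cols = len(a), len(a[0])
    residual = [[float(x) for x in row] for row in a]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(rows):
            for j in range(cols):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term
    return approx


# Hypothetical integer term frequencies: rows are documents, columns are terms.
tf = [
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 3, 1],
    [0, 1, 1, 2],
]

approx = rank_k_approximation(tf, 2)
for row in approx:
    print(" ".join("%6.2f" % x for x in row))
```

Rounding these values back to integers would throw away most of what the reduction computed, which is exactly the storage problem.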

Does anybody who knows the Lucene codebase well have any tips? Has anybody even
tried performing LSI on a Lucene index?

