mahout-user mailing list archives

From Jake Mannix
Subject Re: LDA from Lucene Indexes
Date Wed, 04 May 2011 18:31:42 GMT
On Wed, May 4, 2011 at 10:46 AM, Ted Dunning wrote:

> Pipelining is good for abstraction and really bad for performance (in the
> map-reduce world).
> My thought is that we could have a multipurpose tool.  Input would be a
> Lucene index, and the program would read term vectors or original text as
> available.  Output would be either a sequence file full of text or a
> sequence file full of vectors.

OK, sure. Then this means modifying the lucene.vectors code, not the
seq2sparse code, right?
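
For illustration only, the single-pass tool described in the quoted text might look roughly like the sketch below. It is not Mahout or Lucene code: IndexedDoc and export_docs are hypothetical names, the documents are plain in-memory stand-ins for index entries, and the whitespace tokenizer is an assumed placeholder for a real analyzer. It only shows the "use term vectors if stored, else fall back to original text" choice and the two output modes.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional, Tuple, Union


@dataclass
class IndexedDoc:
    """Hypothetical stand-in for one document read from an index."""
    doc_id: str
    term_vector: Optional[Dict[str, int]] = None  # term -> frequency, if stored
    stored_text: Optional[str] = None             # original text, if stored


def export_docs(
    docs: Iterable[IndexedDoc], want_vectors: bool
) -> Iterator[Tuple[str, Union[str, Dict[str, int]]]]:
    """Yield (doc_id, value) pairs in one pass over the documents.

    want_vectors=True  -> value is a term-frequency dict
    want_vectors=False -> value is the original text
    Documents that cannot supply the requested output are skipped.
    """
    for doc in docs:
        if want_vectors:
            if doc.term_vector is not None:
                # Preferred path: the term vector is already stored.
                yield doc.doc_id, dict(doc.term_vector)
            elif doc.stored_text is not None:
                # Fallback: count tokens from the stored text.
                # Whitespace splitting is an assumption, not a real analyzer.
                counts: Dict[str, int] = {}
                for token in doc.stored_text.lower().split():
                    counts[token] = counts.get(token, 0) + 1
                yield doc.doc_id, counts
        elif doc.stored_text is not None:
            yield doc.doc_id, doc.stored_text
```

In a real job the yielded pairs would be written to a sequence file rather than collected in memory, so either output is produced without an intermediate pipeline stage.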

