mahout-user mailing list archives

From Ted Dunning
Subject Re: SVD Memory Reqs
Date Tue, 06 Jul 2010 21:35:43 GMT
My unsubstantiated guess is that most of these singular vectors could
actually be replaced with random vectors with no impact.  All of the studies
I have seen that measure how many singular vectors are necessary also change
the dimensionality as they test different numbers, so the two effects are
confounded.  I think it would be better to keep the dimensionality constant
and change only how many of the vectors are actual singular vectors and how
many are random.
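
A rough sketch of that experiment in Python/NumPy, in case it helps make the
idea concrete.  Everything here is an assumption for illustration: the toy
matrix, the helper names (mixed_basis, reconstruction_error), and above all
the use of reconstruction error as the yardstick.  A real test would swap in
whatever task metric you care about (retrieval quality, classification
accuracy, ...) and would not use a dense full SVD on a large corpus.

```python
import numpy as np


def mixed_basis(A, k_total, k_singular, rng):
    """Build an orthonormal (n_features, k_total) basis.

    The first k_singular directions are the top right singular vectors
    of A; the remaining k_total - k_singular are random directions.
    Requires k_singular <= k_total <= n_features.
    """
    # Dense thin SVD: fine for a toy matrix, not for a real corpus.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    singular = vt[:k_singular].T  # shape (n_features, k_singular)
    random = rng.standard_normal((A.shape[1], k_total - k_singular))
    # QR orthonormalizes the random columns against the singular ones
    # without changing the subspace the columns span.
    q, _ = np.linalg.qr(np.hstack([singular, random]))
    return q


def reconstruction_error(A, basis):
    """Relative Frobenius error after projecting rows of A onto the basis.

    This is only a stand-in metric (an assumption of this sketch).
    """
    projected = A @ basis  # reduced representation, (n_docs, k_total)
    return np.linalg.norm(A - projected @ basis.T) / np.linalg.norm(A)


rng = np.random.default_rng(0)

# Hypothetical toy "term-document" matrix: rank-30 signal plus noise.
A = rng.standard_normal((200, 30)) @ rng.standard_normal((30, 500))
A += 0.1 * rng.standard_normal(A.shape)

k_total = 50  # dimensionality is held constant throughout
errors = {
    k: reconstruction_error(A, mixed_basis(A, k_total, k, rng))
    for k in (0, 10, 20, 30, 40, 50)
}
for k, err in errors.items():
    print(f"{k:2d} singular + {k_total - k:2d} random: error = {err:.4f}")
```

The only thing that varies between runs is the split between singular and
random vectors, so any change in the metric cannot be blamed on a change in
dimensionality.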

On Tue, Jul 6, 2010 at 2:27 PM, Jake Mannix wrote:

> My rule of thumb has been that for text type stuff (i.e. LSI/LSA),
> something around 200-400 is the most you'll ever need.  For smaller
> corpora and/or vocabularies, even below the bottom end of this range
> is fine.
