mahout-user mailing list archives

From Sean Owen <>
Subject Re: Performance gains with changes in distance calculation
Date Fri, 22 May 2009 15:17:34 GMT
I'll add that I am not surprised that this is significant overhead, and I
imagine most uses of Objects in the code, in Maps and such, will need to be
optimized before 1.0.
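
To illustrate where that overhead comes from, here is a minimal sketch, not
taken from the Mahout code base. The class and method names are hypothetical,
and it assumes vectors are either sparse maps from Integer index to Double
value or dense double[] arrays of equal length. It computes the same squared
Euclidean distance both ways.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not a Mahout class.
class DistanceSketch {

    // Boxed version: each key is an Integer object and each value a Double
    // object, so every lookup hashes an object and every read unboxes one.
    static double squaredDistanceBoxed(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            double diff = e.getValue() - (other == null ? 0.0 : other);
            sum += diff * diff;
        }
        // Indices present only in b contribute their full squared value.
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += e.getValue() * e.getValue();
            }
        }
        return sum;
    }

    // Primitive version: no object allocation and no unboxing in the loop.
    // Assumes both arrays have the same length.
    static double squaredDistancePrimitive(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 1.0);
        a.put(2, 3.0);
        Map<Integer, Double> b = new HashMap<>();
        b.put(0, 4.0);
        b.put(1, 2.0);
        System.out.println(squaredDistanceBoxed(a, b));
        System.out.println(squaredDistancePrimitive(
                new double[] {1.0, 0.0, 3.0}, new double[] {4.0, 2.0, 0.0}));
    }
}
```

Both methods return the same value for equivalent inputs. How much faster the
primitive loop is depends on the JVM and the data, so it is worth measuring
rather than assuming.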

Doesn't the Apache or Google collections lib also have primitive-value
collections in addition to Trove? I swear one of them does. And they are
Apache licensed.
Having run up hard against performance and memory issues with java.util, I
long ago wrote custom implementations whose goal is more to use less
memory by using fewer Objects, but that also contributes to speed. Making a
primitivized version of those could prove exceptionally quick. That is in ou

But I favor reusing existing code first.
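
For concreteness, a primitive-valued map along these lines could look roughly
like the sketch below. It is not the custom implementation mentioned above,
nor the API of Trove or any other library. The class name, the linear-probing
scheme and the 50% load factor are all assumptions made for illustration, and
removal is left out to keep it short.

```java
// Hypothetical sketch: an int -> double hash map backed by primitive arrays,
// so that no Integer or Double objects are allocated per entry.
final class IntDoubleMapSketch {
    private int[] keys;
    private double[] values;
    private boolean[] used;   // marks which slots hold an entry
    private int size;

    IntDoubleMapSketch(int expectedSize) {
        int capacity = 16;
        // Keep capacity a power of two and at least twice the expected size.
        while (capacity < expectedSize * 2) {
            capacity <<= 1;
        }
        keys = new int[capacity];
        values = new double[capacity];
        used = new boolean[capacity];
    }

    // Returns the slot holding the key, or the empty slot where it would go.
    // The table is never more than half full, so an empty slot always exists.
    private int indexOf(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9;          // cheap bit mixing
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;            // linear probing
        }
        return i;
    }

    void put(int key, double value) {
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = indexOf(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    double get(int key, double defaultValue) {
        int i = indexOf(key);
        return used[i] ? values[i] : defaultValue;
    }

    int size() {
        return size;
    }

    // Doubles the table and reinserts every existing entry.
    private void resize() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new double[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                put(oldKeys[j], oldValues[j]);
            }
        }
    }
}
```

The per-entry cost here is one int, one double and one boolean, with no
per-entry node object. That is where the memory saving over a
java.util.HashMap<Integer, Double> would come from.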

On May 22, 2009 12:15 PM, "Grant Ingersoll" <> wrote:

On May 22, 2009, at 6:52 AM, Shashikant Kore wrote:
> Hi,
>
> I am working on clustering a dataset...

Very cool.

> I know by experience that using Integer, Double objects instead of
> primitives is computational...

It's a bit complicated by Trove, b/c that is LGPL.  What that means,
unfortunately, is that we can't check it into our code or distribute it.
However, if it is in a Maven repo somewhere (I see an old version) then it
is easier to include.  I haven't looked at the code, but is it possible that fills the same role or some other
library out there that has a more friendly license?

Regardless of these, feel free to submit a patch, so we can at least look at
it and have something concrete to discuss in JIRA.


Grant Ingersoll
