lucene-dev mailing list archives

From "Robert Engels" <>
Subject RE: Lucene vs. Ruby/Odeum
Date Thu, 02 Jun 2005 13:59:05 GMT
There are still very SIGNIFICANT problems with his tests.

1. The environment is not "real", except for possibly desktop searching.
Whether the JVM needs 64m or 128m to perform adequately is immaterial, given
the price of RAM and the ease of expansion. It would be akin to saying
"let's all try to write programs that work with 64k of memory". Why
needlessly constrain yourself? I would take performance, readability,
maintainability, and reliability over memory consumption any day. If the
point of the test is to show that a Java based searching system needs more
memory than a script language on top of a C db library, who cares? The
smallest possible Java program I can write shows a VM size of 8.5mb under
Windows XP - which is larger than the whole of Odeum/Ruby. There is a lot
that the Java system provides that isn't useful in this particular use case,
but I run Lucene in a multithreaded, server environment, with a test index
of 350mb, and I can run it in as little as 19mb - but why would I want to?

2. The search is always for the same word. The Odeum database based version
will almost certainly cache all the required data and index blocks in memory
after the first run, avoiding all calls to the OS. Since Lucene performs no
local caching (without my mods), it will ALWAYS require trips to the OS.
Also, each run of Lucene is going to generate garbage without question. A
properly designed non-Java db will almost certainly generate no increased
memory usage in the constrained case. Running the tests using multiple
threads on random words would be far more interesting.
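
As a rough sketch of what such a test could look like: the harness below
runs several threads, each issuing searches for randomly chosen words, and
reports the total query count and elapsed time. Everything in it is
hypothetical (the RandomWordBenchmark class, its run method, and the
stand-in search function); it does not call Lucene or Odeum, and a real
test would plug an actual search call in where the stand-in is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToIntFunction;

// Hypothetical harness: none of these names come from Lucene, Odeum, or Zed's tests.
class RandomWordBenchmark {

    /** Totals collected across all worker threads. */
    static final class Result {
        final int queries;       // number of searches issued
        final long totalHits;    // sum of the hit counts returned by the search function
        final long elapsedNanos; // wall-clock time for the whole run

        Result(int queries, long totalHits, long elapsedNanos) {
            this.queries = queries;
            this.totalHits = totalHits;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /**
     * Runs `threads` workers; each issues `queriesPerThread` searches for
     * words drawn at random from `vocabulary`. `search` maps a word to a hit count.
     */
    static Result run(ToIntFunction<String> search, List<String> vocabulary,
                      int threads, int queriesPerThread, long seed) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                // One generator per thread, seeded so a run can be repeated.
                final Random random = new Random(seed + t);
                Callable<Long> worker = () -> {
                    long hits = 0;
                    for (int i = 0; i < queriesPerThread; i++) {
                        String word = vocabulary.get(random.nextInt(vocabulary.size()));
                        hits += search.applyAsInt(word);
                    }
                    return hits;
                };
                futures.add(pool.submit(worker));
            }
            long totalHits = 0;
            for (Future<Long> f : futures) {
                totalHits += f.get(); // waits for the worker to finish
            }
            long elapsed = System.nanoTime() - start;
            return new Result(threads * queriesPerThread, totalHits, elapsed);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> vocabulary = Arrays.asList("lucene", "ruby", "odeum", "index", "search");
        // Stand-in "engine": the hit count is just the word length.
        Result r = run(String::length, vocabulary, 4, 1000, 42L);
        System.out.println(r.queries + " queries in " + (r.elapsedNanos / 1_000_000) + " ms");
    }
}
```

With a large vocabulary and several threads, no single set of index blocks
stays hot, so the numbers say more about the engine than about whichever
cache happens to hold one word.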

Lastly, for what it's worth (and that's probably not much!) - if Odeum were
the "better search engine", you could do a Java -> Odeum mapping and I
GUARANTEE the Java implementation using the latest JVM JIT compilers will
be faster than the Ruby one. Also, where's my cross platform GUI for
displaying the search results?

The best developers attempt to use the right tool for the job. Ruby is a
GREAT scripting language, and is perfect for all the things scripting
languages are good for. Let's leave it at that.

-----Original Message-----
From: Erik Hatcher []
Sent: Thursday, June 02, 2005 5:09 AM
Subject: Re: Lucene vs. Ruby/Odeum

Zed has updated his second part with more experiments with different
JVMs and memory settings:

On Jun 2, 2005, at 12:27 AM, Robert Engels wrote:
> I read all of Zed's posts on the subject and I feel he certainly
> presents a
> strong anti-Java

Most definitely an anti-Java leaning - but at least he's working on
being objective about it by measuring things :)

> , if not anti-Lucene bias - maybe just pro Ruby.

He's quite pro-Lucene, and most definitely pro-Ruby.  I consider
myself in those categories as well.

> If you do not even adhere to the principal designer's "guidelines
> to proper
> usage", your tests are meaningless. It's akin to using a new flat
> screen
> monitor and claiming "boy, it has a fuzzy picture", because you didn't
> follow the instructions that said "remove protective film before
> using".

I concur with your sentiment and I've done what I can via e-mail with
him to educate him on my experience with Lucene and JVM garbage
collection.  I'd encourage anyone who has the time and
inclination to take him up on the request to show how to do it better
since he's made his code available.

> Zed is using a very constrained test - which is probably very unlike
> the real world of server based systems, to attempt to discern the
> relative
> performance characteristics of Lucene/Java/Ruby/etc. The tests may be
> applicable in his poorly designed environment, but he presents his
> limited
> findings as "gospel", as if they should hold true in all cases. I
> quote...
> "For the people who have no clue (also known as "Executives")
> here's the
> information you need to tell all your employees they need to adopt the
> latest and greatest thing without ever having to understand
> anything you
> read. Cheaper than an article in CIO magazine and even has big
> words like
> "standard deviation"." and then goes on to present his "statistically
> correct" performance numbers.

Don't get me wrong - Zed is using inflammatory language.  We should
work to not lower ourselves to speaking in that same tone but rather
objectively and nicely point out the errors of his ways.  He's open
to that despite his caustic tone - at least from the e-mail exchanges
I've had with him.


To unsubscribe, e-mail:
For additional commands, e-mail:
