lucene-solr-user mailing list archives

From Bill Dueber <b...@dueber.com>
Subject Trie vs long string for sorting
Date Tue, 23 Jun 2009 15:54:45 GMT
I'm having trouble understanding how the Trie type compares (speed- and
memory-wise) with long *strings* (as opposed to integers).

My data are library call numbers, normalized to be comparable, resulting in
strings of at most 21 characters of the form "RK 052180H359~999~999".

Now, these are fine -- they work for sorting and ranges and the whole thing,
but right now I can't use them because I've got two or three for each of my
6M documents and on a 32-bit machine I run out of heap.
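
A rough back-of-the-envelope sketch of the raw payload involved, in Python.
It assumes the upper bound of three values per document, that every value
is distinct and held in memory as 2-byte Java chars, and it ignores object
headers and whatever index structures Lucene keeps on top, so the real
figures will differ. The names are illustrative only.

```python
# Figures taken from the message: 6M documents, up to 3 keys each,
# keys of at most 21 characters.
DOCS = 6_000_000
VALUES_PER_DOC = 3          # assumption: upper bound of "two or three"
CHARS_PER_KEY = 21
BYTES_PER_JAVA_CHAR = 2     # Java chars are UTF-16 code units
BYTES_PER_LONG = 8


def payload_bytes(bytes_per_value):
    """Raw bytes needed to hold one value per (document, field) pair."""
    return DOCS * VALUES_PER_DOC * bytes_per_value


# Character data only, without any per-object overhead.
string_bytes = payload_bytes(CHARS_PER_KEY * BYTES_PER_JAVA_CHAR)
long_bytes = payload_bytes(BYTES_PER_LONG)

print(f"strings: {string_bytes / 2**20:.0f} MiB")
print(f"longs:   {long_bytes / 2**20:.0f} MiB")
```

Under these assumptions the character data alone is several times larger
than the same number of longs, before any per-object overhead is counted.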

Another option would be to turn them into longs (using roughly 56 bits of
the 64-bit space) and use a trie type. Is there any sort of a win involved
there?
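
To make the idea concrete, here is a minimal sketch of an order-preserving
packing of a fixed-width key into an integer. The two-field layout below is
hypothetical and much simpler than a real normalized call number; it is not
the actual 56-bit scheme, only an illustration of the technique. Each field
has a fixed width and its own sorted alphabet, and the key is read as a
mixed-radix number, so integer order matches string order.

```python
# Hypothetical layout: (field width in characters, alphabet in sort order).
# This is NOT the real call-number format, just a small stand-in.
LAYOUT = [
    (2, " ABCDEFGHIJKLMNOPQRSTUVWXYZ"),  # class letters, space-padded
    (4, "0123456789"),                   # zero-padded number
]


def pack(key):
    """Pack a fixed-width key into an int that sorts the same way."""
    value = 0
    pos = 0
    for width, alphabet in LAYOUT:
        for ch in key[pos:pos + width]:
            # Mixed-radix step: shift by this field's radix, add the digit.
            value = value * len(alphabet) + alphabet.index(ch)
        pos += width
    return value


keys = ["RK0522", "A 0001", "Z 9999", "AB0000", "RK0521"]
# Sorting by the packed integer gives the same order as sorting the strings.
assert sorted(keys, key=pack) == sorted(keys)
# The largest key in this toy layout still fits easily in a 64-bit long.
assert pack("ZZ9999").bit_length() <= 64
```

The property holds only because every key has the same width and each
alphabet is listed in the same order the string comparison uses.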

-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library
