lucene-solr-user mailing list archives

From Toke Eskildsen <...@statsbiblioteket.dk>
Subject RE: How large is your solr index?
Date Sat, 03 Jan 2015 21:00:12 GMT
Erick Erickson [erickerickson@gmail.com] wrote:
> Of course I wouldn't be doing the work so I really don't have much of
> a vote, but it's not clear to me at all that enough people would actually
> have a use-case for 2b+ docs in a single shard to make it
> worthwhile. At that scale GC potentially becomes really unpleasant for
> instance....

Over the last few years we have seen a few such use cases here on the mailing list. I would be
very surprised if the number of such cases does not keep rising. Currently the work required for
a complete overhaul does not measure up to the rewards, but that is slowly changing. At the very
least I find it prudent not to limit new Lucene/Solr interfaces to ints.

As for GC: right now a lot of structures are single-array oriented (for example, using one long
array to represent the bits in a bitset), which might not work well with current garbage collectors.
A change to higher limits also means rethinking such approaches: if the garbage collector prefers
objects below a certain size, then split the arrays into chunks of that size. Likewise, iterations
over structures whose size is linear in the index size could be threaded. These are issues even
with the current 2b limitation.
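To illustrate the chunked-array idea: a minimal sketch of a bitset addressed by long, backed by
many small long arrays instead of one huge one, so each allocation stays below a size the garbage
collector handles comfortably. The class name and chunk size are purely illustrative assumptions,
not Lucene code:

```java
// Sketch only: a long-addressable bitset split into small chunks.
// Chunk size is an illustrative assumption, not a tuned value.
public class ChunkedBitSet {
    private static final int CHUNK_BITS = 1 << 16;            // bits per chunk
    private static final int LONGS_PER_CHUNK = CHUNK_BITS / 64;
    private final long[][] chunks;                            // lazily allocated

    public ChunkedBitSet(long numBits) {
        int numChunks = (int) ((numBits + CHUNK_BITS - 1) / CHUNK_BITS);
        chunks = new long[numChunks][];
    }

    public void set(long bit) {
        int chunk = (int) (bit / CHUNK_BITS);
        int offset = (int) (bit % CHUNK_BITS);
        if (chunks[chunk] == null) {
            chunks[chunk] = new long[LONGS_PER_CHUNK]; // allocate on first touch
        }
        chunks[chunk][offset >>> 6] |= 1L << (offset & 63);
    }

    public boolean get(long bit) {
        int chunk = (int) (bit / CHUNK_BITS);
        int offset = (int) (bit % CHUNK_BITS);
        long[] c = chunks[chunk];
        return c != null && (c[offset >>> 6] & (1L << (offset & 63))) != 0;
    }

    public static void main(String[] args) {
        // Address a bit well past the 2^31 limit that a single array imposes.
        ChunkedBitSet bits = new ChunkedBitSet(5_000_000_000L);
        bits.set(4_999_999_999L);
        System.out.println(bits.get(4_999_999_999L)); // true
        System.out.println(bits.get(42L));            // false
    }
}
```

Besides lifting the 2b limit, the lazy per-chunk allocation means a sparse bitset never pays for
one monolithic allocation up front.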

- Toke Eskildsen
