lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How large is your solr index?
Date Sat, 03 Jan 2015 21:41:10 GMT
I can't disagree. You bring up some of the points that make me _extremely_
reluctant to try to get this into 5.x though. 6.0 at the earliest, I should
think.

And who knows? Java may get a GC process that's geared to modern
amounts of memory and get by the current pain....

Best,
Erick

On Sat, Jan 3, 2015 at 1:00 PM, Toke Eskildsen <te@statsbiblioteket.dk>
wrote:

> Erick Erickson [erickerickson@gmail.com] wrote:
> > Of course I wouldn't be doing the work so I really don't have much of
> > a vote, but it's not clear to me at all that enough people would actually
> > have a use-case for 2b+ docs in a single shard to make it
> > worthwhile. At that scale GC potentially becomes really unpleasant for
> > instance....
>
> Over the last few years we have seen a few use cases here on the mailing
> list. I would be very surprised if the number of such cases does not keep
> rising. Currently the work for a complete overhaul does not measure up to
> the rewards, but that is slowly changing. At the very least I find it
> prudent to not limit new Lucene/Solr interfaces to ints.
>
> As for GC: Right now a lot of structures are single-array oriented (for
> example using a long-array to represent bits in a bitset), which might not
> work well with current garbage collectors. A change to higher limits also
> means re-thinking such approaches: If the garbage collectors like objects
> below a certain size, then split the arrays into blocks of that size.
> Likewise, iterations over structures linear in size to the index could be
> threaded. These are issues even with the current 2b limitation.
>
> - Toke Eskildsen
>
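
To illustrate the point about single-array structures in the quoted message:
below is a minimal sketch of a bitset that is addressed by long indexes and
stored as many fixed-size long[] blocks instead of one large array. It is not
Lucene or Solr code. The class name ChunkedBitSet and the block size of 8192
longs (64 KiB) are assumptions made for the example; a suitable block size
depends on the collector in use. The cardinality() method also shows one way
an iteration that is linear in the size of the structure could be threaded,
here with a parallel stream over the blocks.

```java
import java.util.Arrays;

/** Illustrative sketch only: a long-indexed bitset split into small blocks. */
public final class ChunkedBitSet {
    // Assumed block size: 2^13 longs = 64 KiB per block. Hypothetical value.
    private static final int BLOCK_SHIFT = 13;
    private static final int BLOCK_LONGS = 1 << BLOCK_SHIFT;
    private static final int BLOCK_MASK = BLOCK_LONGS - 1;

    private final long[][] blocks;
    private final long numBits;

    public ChunkedBitSet(long numBits) {
        if (numBits < 0) {
            throw new IllegalArgumentException("numBits must be >= 0");
        }
        this.numBits = numBits;
        long numWords = (numBits + 63) >>> 6; // 64 bits per long
        int numBlocks = Math.toIntExact((numWords + BLOCK_LONGS - 1) >>> BLOCK_SHIFT);
        blocks = new long[numBlocks][];
        // Every block has the same modest size, so no single huge array exists.
        // The last block may be larger than strictly needed.
        for (int i = 0; i < numBlocks; i++) {
            blocks[i] = new long[BLOCK_LONGS];
        }
    }

    public void set(long index) {
        checkIndex(index);
        long word = index >>> 6;
        blocks[(int) (word >>> BLOCK_SHIFT)][(int) (word & BLOCK_MASK)] |= 1L << (index & 63);
    }

    public boolean get(long index) {
        checkIndex(index);
        long word = index >>> 6;
        long bits = blocks[(int) (word >>> BLOCK_SHIFT)][(int) (word & BLOCK_MASK)];
        return (bits & (1L << (index & 63))) != 0;
    }

    /** Counts set bits, processing the blocks in parallel. */
    public long cardinality() {
        return Arrays.stream(blocks)
                .parallel()
                .mapToLong(block -> {
                    long count = 0;
                    for (long w : block) {
                        count += Long.bitCount(w);
                    }
                    return count;
                })
                .sum();
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= numBits) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + numBits);
        }
    }
}
```

Because indexes are longs and each block holds 524,288 bits, the same layout
works past the 2b boundary: new ChunkedBitSet(3_000_000_000L) would allocate
about 5,700 blocks of 64 KiB each rather than one array of roughly 358 MiB.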
