lucene-java-user mailing list archives

From Glen Newton <glen.new...@gmail.com>
Subject Re: docid is just a signed int32
Date Thu, 18 Aug 2016 14:03:48 GMT
Or maybe it is time Lucene re-examined this limit.

There are use cases out there where >2^31 does make sense in a single index
(huge number of tiny docs).

Also, I think the underlying hardware and the JDK have advanced enough to make
this more defensible.

Constructively,
Glen


On Thu, Aug 18, 2016 at 9:55 AM, Adrien Grand <jpountz@gmail.com> wrote:

> No, IndexWriter enforces that the number of documents cannot go over
> IndexWriter.MAX_DOCS (which is a bit less than 2^31) and
> BaseCompositeReader computes the number of documents in a long variable and
> ensures it is less than 2^31, so you cannot have indexes that contain more
> than 2^31 documents.
>
> Larger collections should be written to multiple shards and use
> TopDocs.merge to merge results.
>
> On Thu, Aug 18, 2016 at 3:38 PM, Cristian Lorenzetto <
> cristian.lorenzetto@gmail.com> wrote:
>
> > docid is a signed int32, so it is not so big, but docid really seems to be
> > not an unmodifiable primary key but a temporary id for the view related to
> > a specific search.
> >
> > So a repository can contain more than 2^31 documents.
> >
> > Is my deduction correct? Is there a maximum size for a Lucene index?
> >
>
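
To make the sharding advice quoted above concrete, here is a minimal sketch in plain Java. It does not call Lucene: the class ShardedTopK, the Hit type, and both methods are hypothetical names invented for illustration, and the tie-break order is a choice made for this sketch. The constant mirrors the documented value of IndexWriter.MAX_DOCS (Integer.MAX_VALUE - 128). It shows the two ideas in the reply: each shard stays under the per-index limit while the total is tracked in a long, and per-shard top hits are merged by score, which is the job TopDocs.merge does for real Lucene results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
class ShardedTopK {
    // Same value as Lucene's IndexWriter.MAX_DOCS, copied so the sketch runs standalone.
    static final int MAX_DOCS_PER_INDEX = Integer.MAX_VALUE - 128;

    // One search hit: the shard it came from, its shard-local docid, and its score.
    static final class Hit {
        final int shard;
        final int doc;
        final float score;

        Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    // Sums per-shard document counts in a long, so the total may exceed 2^31
    // even though every single shard must stay within the per-index limit.
    static long totalDocs(int[] shardDocCounts) {
        long total = 0L;
        for (int count : shardDocCounts) {
            if (count < 0 || count > MAX_DOCS_PER_INDEX) {
                throw new IllegalArgumentException("shard doc count out of range: " + count);
            }
            total += count;
        }
        return total;
    }

    // Merges per-shard hit lists into one global top-N list, highest score first.
    // Ties are broken by shard index, then by docid (an assumption of this sketch).
    static List<Hit> merge(int topN, List<List<Hit>> perShardHits) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShardHits) {
            all.addAll(hits);
        }
        all.sort((a, b) -> {
            int c = Float.compare(b.score, a.score);
            if (c != 0) return c;
            c = Integer.compare(a.shard, b.shard);
            if (c != 0) return c;
            return Integer.compare(a.doc, b.doc);
        });
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }

    public static void main(String[] args) {
        // Two shards of two billion documents each: legal per shard, over 2^31 in total.
        System.out.println(totalDocs(new int[] {2_000_000_000, 2_000_000_000}));
    }
}
```

Note that a hit is identified by the pair (shard, doc) rather than by a single int, which is what lets the collection as a whole grow past the int32 docid range.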
