lucene-solr-user mailing list archives

From "Burton-West, Tom" <tburt...@umich.edu>
Subject Experience with indexing billions of documents?
Date Fri, 02 Apr 2010 15:57:25 GMT
We are currently indexing 5 million books in Solr, scaling up over the next few years to 20
million. However, we are using the entire book as a single Solr document. We are evaluating the
possibility of indexing individual pages instead, since there are use cases where users want the
most relevant pages regardless of which book they occur in. At page granularity, though, we
estimate somewhere between 1 and 6 billion pages, and we have concerns about whether Solr will
scale to that level.

Does anyone have experience running Solr with 1-6 billion documents?

The Lucene file format documentation (http://lucene.apache.org/java/3_0_1/fileformats.html#Limitations)
mentions a limit of about 2 billion document IDs. I assume this refers to Lucene's internal
document IDs and would therefore be a per-index/per-shard limit. Is this correct?
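
For a rough sense of scale, here is a back-of-envelope sketch in plain Java (not any Solr or
Lucene API), assuming the ~2 billion figure corresponds to Integer.MAX_VALUE and really does
apply per index, i.e. per shard:

    public class ShardEstimate {
        public static void main(String[] args) {
            // Lucene's internal document IDs are Java ints, so a single index
            // (one Solr shard/core) tops out at Integer.MAX_VALUE documents.
            final long maxDocsPerShard = Integer.MAX_VALUE;    // 2,147,483,647

            final long totalPages = 6000000000L;               // upper end of our page estimate

            // Minimum shard count just to stay under the doc-ID ceiling;
            // a real deployment would presumably use many more shards for performance.
            final long minShards = (totalPages + maxDocsPerShard - 1) / maxDocsPerShard;
            System.out.println("Minimum shards for 6 billion docs: " + minShards);  // prints 3
        }
    }

If that assumption holds, only a handful of shards are needed just to clear the doc-ID ceiling;
the bigger question for us is how many shards would be needed for acceptable query performance.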


Tom Burton-West.



