lucene-solr-user mailing list archives

From Toke Eskildsen ...@statsbiblioteket.dk>
Subject RE: Need Debug Direction on Performance Problem
Date Sun, 18 Jan 2015 10:49:41 GMT
Naresh Yadav [nyadav.ait@gmail.com] wrote:
> In both setups, we are reading in batches of 50k and each batch taking
> Setup1  : approx 7 seconds and for completing all batches of total 10 lakh
> results takes 1 to 2 minutes.
> Setup2 : approx 2-3 minutes and for completing all batches of total 10 lakh
> results  takes 114 minutes.

Deep paging across shards without cursors means that for each request, every shard must return
its full result set up to and including the requested page (start + rows entries). The deeper
the page, the longer each request takes, so the total time grows much faster than linearly. If
you extracted only 500K results instead of the 1M in setup 2, it would likely take a lot less
than half of the 114 minutes.
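
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a simplified
cost model in which each request makes every shard supply start + rows entries and the cost
is proportional to that count; real timings also depend on merging, network and caching. The
function name is made up for illustration.

```python
def entries_supplied_per_shard(total_docs, page_size):
    """Sum of (start + rows) over all pages when paging without a cursor."""
    pages = total_docs // page_size
    # Page 1 needs page_size entries, page 2 needs 2 * page_size, and so on.
    return sum(page * page_size for page in range(1, pages + 1))


full = entries_supplied_per_shard(1_000_000, 50_000)  # 20 pages
half = entries_supplied_per_shard(500_000, 50_000)    # 10 pages
print(full, half, round(half / full, 2))  # prints: 10500000 2750000 0.26
```

Under this model, stopping at 500K costs roughly a quarter of the full export, not half.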

Since you are exporting the full result set, you should be using a cursor:
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
This should make your extraction time linear in the number of documents and hopefully a lot
faster than your current setup.
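
A minimal sketch of such a cursor loop, using Solr's cursorMark / nextCursorMark parameters
as described on the page above. The helper names, the base URL in the usage note and the
uniqueKey field name "id" are assumptions for illustration; adjust them to your schema.

```python
import json
import urllib.parse
import urllib.request


def solr_fetch(base_url, params):
    """Send one /select request and return the decoded JSON response."""
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{base_url}/select?{query}") as resp:
        return json.load(resp)


def export_all(fetch, query="*:*", rows=50_000, unique_key="id"):
    """Yield every matching document by following Solr cursor marks.

    `fetch` is any callable that takes a dict of request parameters and
    returns the parsed JSON response.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        response = fetch({
            "q": query,
            "rows": rows,
            "sort": f"{unique_key} asc",  # the sort must include the uniqueKey field
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged mark means there are no more results
            break
        cursor = next_cursor
```

It could be used along the lines of
`for doc in export_all(lambda p: solr_fetch("http://localhost:8983/solr/collection1", p)): ...`
where the URL is a placeholder. Each request then only asks every shard for the next `rows`
entries after the cursor position, instead of everything up to that point.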

Also, please refrain from using regional units such as "lakh" in an international forum. It
requires some readers (me, for example) to perform a search in order to be sure of what you
are talking about.

- Toke Eskildsen
