lucene-solr-user mailing list archives

From Geert-Jan Brits
Subject Re: indexing best practices
Date Sun, 18 Jul 2010 14:06:36 GMT
Have you read:

In short, there are only guidelines (see links), no definitive answers.
If you have followed the guidelines for improving indexing speed on a single
box and, after testing various settings, indexing is still too slow, you may
want to test this scenario:
1. Index to several boxes/shards (distributing documents round-robin or similar).
2. Copy all the created indexes to one box.
3. Use IndexWriter.addIndexes to merge the indexes.
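
For step 1, here is a minimal sketch of round-robin distribution in plain Java.
The class and method names (RoundRobinSharder, partition) are made up for
illustration and are not part of Solr or Lucene. It only splits an in-memory
list into per-shard buckets; sending each bucket to its own indexer, copying
the indexes, and the merge in step 3 are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not a Solr/Lucene class.
public class RoundRobinSharder {

    /**
     * Splits docs into shardCount buckets, assigning the document at
     * position i to bucket i % shardCount. Order is preserved within
     * each bucket, and bucket sizes differ by at most one.
     */
    public static <T> List<List<T>> partition(List<T> docs, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        List<List<T>> shards = new ArrayList<>();
        for (int s = 0; s < shardCount; s++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            shards.get(i % shardCount).add(docs.get(i));
        }
        return shards;
    }
}
```

Each bucket would then be fed to a separate indexing box. Round robin keeps
the shards close to equal in document count, which helps only if documents
are of roughly similar size; with a mix of small and large documents you may
want to balance on bytes instead.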

Doing steps 1-3 on SSDs will of course boost performance a lot as well
(on large indexes, because small ones may fit entirely in the disk cache).
Hope that helps a bit,

2010/7/18 kenf_nc

> No one has done performance analysis? Or has a link to anywhere where it's
> been done?
> Basically, the fastest way to get documents into Solr. So many options
> available; what's the fastest:
> 1) file import (XML, CSV) vs DIH vs POSTing
> 2) number of concurrent clients: 1 vs 10 vs 100. Is there a
> diminishing-returns number?
> I have 16 million small docs (8 to 10 fields, no large text fields) that get
> updated monthly and 2.5 million largish docs (20 to 30 fields, a couple of
> HTML text fields) that get updated monthly. It currently takes about 20 hours
> to do a full import. I would like to cut that down as much as possible.
> Thanks,
> Ken
