lucene-solr-user mailing list archives

From Blargy <>
Subject Importing large datasets
Date Wed, 02 Jun 2010 01:54:36 GMT

We have around 5 million items in our index, and each item has a description
stored in a separate physical database. These item descriptions vary in
size and for the most part are quite large. Currently we index only the
items, not their corresponding descriptions, and a full import takes around
4 hours. Ideally we want to index both the items and their descriptions, but
after some quick profiling I determined that a full import would then take in
excess of 24 hours.
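
To put rough numbers on this, the short sketch below converts the figures
above (5 million items, 4 hours without descriptions, 24 hours as a lower
bound with them) into per-item cost. The function names are only
illustrative, and the 24-hour figure is an estimate, so the results are
ballpark values rather than measurements.

```python
ITEMS = 5_000_000  # approximate item count from the message


def per_item_ms(total_hours, items=ITEMS):
    """Average wall-clock milliseconds spent per item for a full import."""
    return total_hours * 3600 * 1000 / items


def items_per_second(total_hours, items=ITEMS):
    """Average import throughput in items per second."""
    return items / (total_hours * 3600)


if __name__ == "__main__":
    without_desc = per_item_ms(4)    # items only
    with_desc = per_item_ms(24)      # items plus descriptions (lower bound)
    print(f"items only:        {without_desc:.2f} ms/item, "
          f"{items_per_second(4):.0f} items/s")
    print(f"with descriptions: {with_desc:.2f} ms/item, "
          f"{items_per_second(24):.0f} items/s")
    print(f"extra cost per description: at least "
          f"{with_desc - without_desc:.2f} ms")
```

In other words, each description adds on the order of 14 ms per item on
average, which is the cost that needs to be attributed to either the
database or Solr.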

- How would I profile the indexing process to determine whether the
bottleneck is Solr or our database?
- In either case, how would one speed up this process? Is there a way to run
parallel import processes and then merge them together at the end? Possibly
use some sort of distributed computing?
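
One way to think about both questions is sketched below. It is only an
illustration under stated assumptions: `fetch` and `index` are hypothetical
callables standing in for "read one description from the database" and
"send one document to Solr", and the idea of splitting the import by
contiguous item ID ranges assumes the items have numeric IDs. Neither
function is part of Solr.

```python
import time


def profile_stages(item_ids, fetch, index):
    """Time the fetch and index stages separately over a sample of items.

    `fetch(item_id)` and `index(doc)` are hypothetical stand-ins for the
    database read and the Solr add, respectively.
    """
    fetch_s = 0.0
    index_s = 0.0
    for item_id in item_ids:
        t0 = time.perf_counter()
        doc = fetch(item_id)
        t1 = time.perf_counter()
        index(doc)
        t2 = time.perf_counter()
        fetch_s += t1 - t0
        index_s += t2 - t1
    return {"fetch_s": fetch_s, "index_s": index_s}


def split_id_ranges(min_id, max_id, parts):
    """Split the inclusive range [min_id, max_id] into up to `parts`
    contiguous inclusive ranges of near-equal size."""
    total = max_id - min_id + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = min_id
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more parts than IDs: skip empty ranges
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    # Simulated stages; replace with a real DB read and a real Solr add.
    def fake_fetch(item_id):
        time.sleep(0.002)
        return {"id": item_id, "description": "..."}

    def fake_index(doc):
        time.sleep(0.0005)

    print(profile_stages(range(50), fake_fetch, fake_index))
    print(split_id_ranges(1, 5_000_000, 4))
```

Running the timing loop on a few thousand real items would show which
stage dominates. Each ID range could then be handed to a separate import
process; whether the results can be merged afterwards depends on the Solr
setup.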

Any ideas? Thanks.