lucene-solr-user mailing list archives

From Anshum Gupta <>
Subject Re: Fastest way to import big amount of documents in SolrCloud
Date Thu, 01 May 2014 20:57:51 GMT
Hi Costi,

I'd recommend SolrJ, with the inserts parallelized. It also helps to set
reasonable commit intervals.
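To make the advice above concrete, here is a minimal sketch of splitting documents into batches and indexing them from a fixed-size thread pool, with one commit at the end instead of one per add. The `indexBatch` stub and the class/variable names are placeholders of my own; in a real job the stub body would be a SolrJ call such as `CloudSolrServer.add(batch)` (Solr 4.x) against your cluster's ZooKeeper host.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelIndexSketch {
    static final AtomicInteger indexed = new AtomicInteger();

    // Placeholder for the real SolrJ call, e.g. (Solr 4.x era):
    //   CloudSolrServer server = new CloudSolrServer(zkHost);
    //   server.setDefaultCollection("collection1");
    //   server.add(batch);   // batch would be List<SolrInputDocument>
    static void indexBatch(List<String> batch) {
        indexed.addAndGet(batch.size());
    }

    public static void main(String[] args) throws Exception {
        int total = 100_000, batchSize = 1_000;
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < total; i++) docs.add("doc-" + i);

        // Index batches concurrently; 8 threads is an arbitrary example size.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int start = 0; start < docs.size(); start += batchSize) {
            List<String> batch =
                docs.subList(start, Math.min(start + batchSize, docs.size()));
            pool.submit(() -> indexBatch(batch));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);

        // One explicit commit after the whole import, rather than
        // committing on every add:
        //   server.commit();
        System.out.println(indexed.get());
    }
}
```

Batching keeps network round-trips down, and deferring the commit avoids paying the cost of opening a new searcher for every batch.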

Just to get a better perspective:
* Why do you want to do a full index everyday?
* How much of data are we talking about?
* What's your SolrCloud setup like?
* Do you already have some benchmarks which you're not happy with?

On Thu, May 1, 2014 at 1:47 PM, Costi Muraru <> wrote:

> Hi guys,
> What would you say is the fastest way to import data into SolrCloud?
> Our use case: each day do a single import of a big number of documents.
> Should we use SolrJ/DataImportHandler/other? Or perhaps is there a bulk
> import feature in SOLR? I came upon this promising link:
> Any idea on how UpdateCSV is performance-wise compared with
> SolrJ/DataImportHandler?
> If SolrJ, should we split the data into chunks and start multiple clients
> at once? That way we could perhaps take advantage of the multiple servers
> in the SolrCloud configuration.
> Either way, after the import is finished, should we do an optimize, a
> commit, or neither?
> Any tips and tricks for performing this process the right way would be
> greatly appreciated.
> Thanks,
> Costi


Anshum Gupta
