lucene-solr-user mailing list archives

From Emir Arnautović <emir.arnauto...@sematext.com>
Subject Re: Data Import Handler with Solr Source behind Load Balancer
Date Fri, 14 Sep 2018 09:25:39 GMT
Hi Thomas,
Is this SolrCloud or Solr master-slave? Do you update the source index while the import is
running? If you are using master-slave, did you check that all the instances behind the LB
are in sync?
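
One way to run that sync check is to compare the document counts each instance reports. The sketch below is only an illustration under assumptions: it reads `numDocs` and `maxDoc` from the `index` section of each core's Luke request handler response (`/admin/luke?numTerms=0&wt=json`), and the host names and core name in the usage note are hypothetical placeholders, not taken from this thread.

```python
import json
from urllib.request import urlopen


def parse_index_counts(luke_json):
    """Extract (numDocs, maxDoc) from a Luke handler JSON response body."""
    index = json.loads(luke_json)["index"]
    return index["numDocs"], index["maxDoc"]


def fetch_index_counts(core_url):
    """Ask one core directly (not through the LB) for its index counts."""
    url = core_url.rstrip("/") + "/admin/luke?numTerms=0&wt=json"
    with urlopen(url) as resp:
        return parse_index_counts(resp.read().decode("utf-8"))


def num_docs_differ(counts_by_host):
    """True if the hosts do not all report the same numDocs."""
    return len({num_docs for num_docs, _ in counts_by_host.values()}) > 1
```

It could be used roughly like this, with the real addresses of the servers behind the LB substituted for the hypothetical ones:

```python
hosts = ["http://solr4-a:8983/solr/mycore", "http://solr4-b:8983/solr/mycore"]  # hypothetical
counts = {h: fetch_index_counts(h) for h in hosts}
print(counts, "out of sync" if num_docs_differ(counts) else "numDocs match")
```

Matching `numDocs` does not prove the instances hold identical documents in the same order, but a mismatch is a quick sign that they are out of sync.
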
My guess would be that DIH is using cursors to read data from the other Solr. If you are using
multiple Solr instances behind the LB, there might be some differences between their indexes
that result in different documents being returned for the same cursor mark. Are numDocs and
maxDoc the same on the new instance after the import?
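
To make the guess above concrete, here is a toy simulation, not DIH's actual code. It assumes a round-robin load balancer and, for simplicity, uses plain offset paging as a stand-in for any paging position (offset or cursor mark) that different instances answer differently. All names are hypothetical.

```python
import itertools


def paged_import(replicas, total, rows):
    """Read `total` docs in pages of `rows`, sending each page request
    to the next replica in round-robin order.

    Returns (rows_fetched, unique_ids_stored). The target keeps one
    document per id, so a repeated id overwrites the earlier copy.
    """
    stored_ids = set()
    fetched = 0
    backends = itertools.cycle(replicas)
    for start in range(0, total, rows):
        replica = next(backends)
        page = replica[start:start + rows]
        fetched += len(page)
        stored_ids.update(page)
    return fetched, len(stored_ids)


ids = list(range(10))
replica_a = ids                 # one instance's result order
replica_b = ids[5:] + ids[:5]   # same documents, different order

single = paged_import([replica_a], total=10, rows=5)
balanced = paged_import([replica_a, replica_b], total=10, rows=5)
print(single)    # (10, 10)
print(balanced)  # (10, 5)
```

In the balanced case the importer still reads 10 rows and skips none, yet only 5 distinct documents end up in the target, because the second page repeats ids from the first. If something like this is happening, maxDoc on the new instance would likely be higher than numDocs until segments are merged, since overwritten documents count as deleted.
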

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 12 Sep 2018, at 05:53, Zimmermann, Thomas <tzimmermann@techtarget.com> wrote:
> 
> We have a Solr v7 instance sourcing data from a Data Import Handler with a Solr data
> source running Solr v4. When it hits a single server in that instance directly, all documents
> are read and written correctly to the v7. When we hit the load balancer DNS entry, the resulting
> Data Import Handler JSON states that it read all the documents and skipped none, and all looks
> fine, but the result set is missing ~20% of the documents in the v7 core. This has happened
> multiple times on multiple environments.
> 
> Any thoughts on whether this might be a bug in the underlying DIH code? I'll also pass
> it along to the server admins on our side for input.

