lucene-solr-user mailing list archives

From Ganesh Sethuraman <ganeshmail...@gmail.com>
Subject how to get high-availability for Solr csv update handler?
Date Mon, 25 Feb 2019 18:15:42 GMT
Hi

We are using SolrCloud 7.2.1 and the Solr CSV update handler to bulk update
several million documents into multiple collections. When we call the CSV
update handler with curl from the command line (as below), we point at a
single Solr server, so the request fails if that server is down. Is there a
way, using simple curl commands, to send the write to the leader the way
SolrJ does?

In the request below, if SOLR1-SERVER is down, the request fails even
though the new leader, say SOLR2-SERVER, is up.

curl 'http://<<SOLR1-SERVER>>:8983/solr/my_collection/update?commit=true' \
  --data-binary @example/exampledocs/books.csv \
  -H 'Content-type:application/csv'

1. I can put a load balancer / ALB in front of Solr, but it still would not
identify the leader, so it may be less efficient.
2. I can write a SolrJ client to do the update, but I am not sure whether I
would get the efficiency of a bulk update, or the simplicity of curl.
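
A third possibility, sketched below, is to keep curl and add failover in a
small shell wrapper. This is only a rough sketch: it relies on the fact that
any live SolrCloud node forwards an update to the correct shard leader, so it
gives availability but not leader-aware routing. The function name
post_csv_with_failover and the host names solr1/solr2 are made up for
illustration; the port, collection and CSV path are the ones from the curl
command above.

```shell
#!/usr/bin/env bash
# Sketch only: try each Solr node in turn until one accepts the CSV upload.
# post_csv_with_failover is a hypothetical helper, not part of Solr or curl.

post_csv_with_failover() {
  local collection="$1" csv_file="$2"
  shift 2
  local host
  for host in "$@"; do
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses;
    # --connect-timeout keeps a dead node from stalling the loop.
    if curl --fail --silent --show-error --connect-timeout 5 \
         "http://${host}:8983/solr/${collection}/update?commit=true" \
         -H 'Content-type:application/csv' \
         --data-binary "@${csv_file}"; then
      echo "posted via ${host}" >&2
      return 0
    fi
    echo "node ${host} failed, trying next" >&2
  done
  # Every node in the list refused or was unreachable.
  return 1
}

# Example call (host names are placeholders):
# post_csv_with_failover my_collection example/exampledocs/books.csv solr1 solr2
```

A retry resends the whole file, so this assumes the collection has a
uniqueKey and that re-posting the same documents simply overwrites them.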

Any best practices for this would be appreciated.

Regards
Ganesh
