lucene-solr-user mailing list archives

From Darrell Burgan
Subject SolrCloud multiple data center support
Date Mon, 03 Feb 2014 16:48:20 GMT
Hello, we are running Solr in a SolrCloud configuration: two Solr instances and three
ZooKeeper nodes in a single data center. We presently have a single search index with about
35 million entries, occupying about 60GB of disk space on each of the two Solr servers (120GB
total). I expect our usage of Solr to grow to include other search indexes, and likely larger
data volumes.

I'm writing because we need to grow beyond a single data center, with two (potentially
incompatible) goals:

1. We need a hot disaster recovery site, in a completely separate data center, that has a
near-realtime replica of the search index.

2. We'd like the option of multiple active/active data centers that each see and update the
same search index, distributed across data centers.

The options I'm aware of from reading archives:

a. Simply set up the remote Solr instances as active parts of the same SolrCloud cluster.
This would essentially involve standing up multiple ZooKeeper nodes and multiple Solr instances
in the second data center, and they would all keep each other in sync magically. This would
also satisfy both of our goals. However, I'm concerned about performance, and about whether
SolrCloud is smart enough to route local search queries only to local Solr servers. Also, how
does such a cluster tolerate and recover from network partitions?

b. The remote Solr instances form their own completely unrelated SolrCloud cluster, and I
have to invent some kind of replication logic of my own to sync data between the two clusters.
This replication would have to be bidirectional to satisfy both of our goals. I strongly dislike
this option, since the application really should not concern itself with data distribution. But
I'll do it if I must.
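
To make the network-partition concern in option a concrete, here is a minimal sketch of the majority-quorum rule that a ZooKeeper ensemble follows: the ensemble stays available only on a side of a partition that can still reach a strict majority of its members. The data center names and node counts are hypothetical, and the model assumes a clean split in which each data center can see only its own ZooKeeper nodes. It says nothing about how Solr itself behaves once ZooKeeper is unreachable.

```python
from __future__ import annotations


def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of a ZooKeeper ensemble of the given size."""
    return ensemble_size // 2 + 1


def sides_with_quorum(nodes_per_dc: dict[str, int]) -> dict[str, bool]:
    """For a full partition between data centers, report which side,
    if any, still holds a majority of the ZooKeeper ensemble.

    Assumption: each data center is cut off from every other one and
    can reach only its own ZooKeeper nodes.
    """
    needed = quorum_size(sum(nodes_per_dc.values()))
    return {dc: count >= needed for dc, count in nodes_per_dc.items()}


if __name__ == "__main__":
    # Hypothetical layouts: ZooKeeper nodes per data center.
    for layout in ({"dc1": 3, "dc2": 2}, {"dc1": 3, "dc2": 3}):
        print(layout, "->", sides_with_quorum(layout))
```

Under this model, no split of an ensemble across exactly two data centers lets both sides keep quorum at once: either one side holds the majority, or (with an even split) neither does. That is the core of the question about whether option a can serve as both a disaster recovery site and an active/active setup.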

So my questions are:

- Can anyone give me any guidance as to option a? Anyone using this in a real production
setting? Words of wisdom? Does it work?

- Are there any other options that I'm not considering?

- What is Solr's answer to such configurations (we can't be alone in needing one)?
Any big enhancements coming on the Solr road map to deal with this?

Darrell Burgan

Darrell Burgan | Chief Architect, PeopleAnswers
office: 214 445 2172 | mobile: 214 564 4450 | fax: 972 692 5386

