lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Request routing / load-balancing TLOG & PULL replica types
Date Mon, 12 Feb 2018 03:03:09 GMT
Talking a little out of my depth here as I haven't been in that code,
so if there are corrections they're welcome.

In general, there's nothing special between TLOG and PULL replicas in
terms of query routing. For that matter, there's nothing special about
either of these vs. NRT replicas for _queries_.

SolrCloud has internal load balancers that route queries across the
network, and it automatically bypasses all non-active replicas. So
there's no particular need to use a fronting load balancer. That said,
it depends on how you are accessing SolrCloud. By that I mean if you
provide a single HTTP endpoint to a single node, you have a single
point of failure. Even this is irrelevant if you use SolrJ because it
has (you guessed it) an internal load balancer and is ZooKeeper-aware,
so it can handle nodes coming and going.

So look at it this way. Each Solr node has a list of active replicas
and directs queries (or sub-queries) to those replicas. It doesn't
matter _where_ the replica is or, indeed, whether any of them are on
the same node. As long as Solr is running, it sends queries to the
right place.
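
To make the point above concrete, here is a toy model in Python. It is
only a sketch, not Solr's actual code: the Replica class, the state
strings and the round-robin policy are all assumptions made up for
illustration. The only thing it is meant to show is that selection
looks at replica state and ignores replica type.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Replica:
    name: str
    rtype: str  # "NRT", "TLOG" or "PULL" (never consulted when routing)
    state: str  # "active", "recovering" or "down"


def active_replicas(replicas):
    """Keep only replicas that can serve queries; replica type is ignored."""
    return [r for r in replicas if r.state == "active"]


def round_robin(replicas):
    """Cycle over the active replicas (hypothetical policy, for illustration)."""
    candidates = active_replicas(replicas)
    if not candidates:
        raise RuntimeError("no active replicas available")
    return cycle(candidates)


# One shard with a TLOG replica and two PULL replicas, one still recovering.
shard = [
    Replica("r1", "TLOG", "active"),
    Replica("r2", "PULL", "recovering"),
    Replica("r3", "PULL", "active"),
]

picker = round_robin(shard)
chosen = [next(picker).name for _ in range(4)]
```

In this model the recovering PULL replica r2 never receives a query,
and the TLOG replica r1 gets the same share as the active PULL replica
r3.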

"will it automatically favour PULL replicas"
No. There is so little extra work in a TLOG vs. a PULL replica that
favouring PULL replicas would be minimally useful. The extra work a
TLOG replica does is just to flush out the incoming documents to the
tlog, a bit of I/O. Much more interesting is that the new metrics are
in place to allow much more intelligent use of resources. A heavy
request on a replica will put _much_ more load on that machine than
handling the TLOG. The metrics will allow SolrCloud to say "replicas
1, 3, 5 are pretty busy, let's not use them for more work until they
get less busy", and it won't matter whether they're TLOG, PULL or NRT
replicas.
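
As a rough illustration of what metrics-driven selection could look
like, here is a small Python sketch. It is purely hypothetical: the
function name, the 0..1 load figure and the 0.8 threshold are
assumptions for the example and do not correspond to any Solr API or
setting.

```python
def eligible_replicas(load_by_replica, busy_threshold=0.8):
    """Return the names of replicas whose load is below the threshold.

    load_by_replica maps a replica name to a load figure between 0 and 1
    (a made-up metric for this sketch). Replica type plays no part. If
    every replica is busy, fall back to all of them so queries can still
    be served.
    """
    not_busy = [
        name for name, load in load_by_replica.items()
        if load < busy_threshold
    ]
    return sorted(not_busy) if not_busy else sorted(load_by_replica)


# Replicas 1, 3 and 5 are busy, so only 2 and 4 are offered for new work.
loads = {
    "replica1": 0.95,
    "replica2": 0.30,
    "replica3": 0.90,
    "replica4": 0.10,
    "replica5": 0.85,
}
available = eligible_replicas(loads)
```

With these numbers only replica2 and replica4 would be offered new
work, whatever their types happen to be.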

Best,
Erick

On Sun, Feb 11, 2018 at 6:35 PM, Greg Roodt <groodt@gmail.com> wrote:
> Hi
>
> I have a question around how queries are routed and load-balanced in a
> cluster of mixed TLOG and PULL replicas.
>
> I thought that I might have to put a load-balancer in front of the PULL
> replicas and direct queries at them manually as nodes are added and removed
> as PULL replicas. However, it seems that SolrCloud handles this
> automatically?
>
> If I add a new PULL replica node, it goes into state="recovering" while it
> pulls the core. As expected. What happens if queries are directed at this
> node while in this state? From what I am observing, the query gets directed
> to another node?
>
> If SolrCloud is handling the routing of requests to active nodes, will it
> automatically favour PULL replicas for read queries and TLOG replicas for
> writes?
>
> Thanks
> Greg
