lucene-solr-user mailing list archives

From Rahul Goswami <rahul196...@gmail.com>
Subject Re: Full index replication upon service restart
Date Thu, 21 Feb 2019 16:46:06 GMT
Erick,
Thanks for the insight. We are looking at tuning the architecture. We are
also stopping the indexing application before we bring down the Solr nodes
for maintenance. However, when both nodes are up and one replica is falling
too far behind, we want to throttle the requests. Is there an API in Solr
to know whether a replica is falling behind the leader?
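There is no single "lag" metric API that I know of in 7.x, but the Collections API CLUSTERSTATUS call does report each replica's state (active / recovering / down), which can serve as a coarse signal. A minimal sketch of parsing that response (the collection name and URL in the usage comment are placeholders):

```python
# Sketch: flag replicas that are not "active" using the response from
# /solr/admin/collections?action=CLUSTERSTATUS&wt=json (Solr 7.x shape).

def lagging_replicas(cluster_status, collection):
    """Return (shard, replica, state) for every replica that is not active."""
    lagging = []
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    for shard_name, shard in shards.items():
        for replica_name, replica in shard["replicas"].items():
            if replica.get("state") != "active":
                lagging.append((shard_name, replica_name, replica.get("state")))
    return lagging

# Usage against a live node (commented out; URL is an assumption):
# import json, urllib.request
# resp = urllib.request.urlopen(
#     "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json")
# print(lagging_replicas(json.load(resp), "DataIndex"))
```

Note this only tells you a replica has already fallen into recovery; it does not quantify how far behind an "active" replica is.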

Thanks,
Rahul

On Mon, Feb 11, 2019 at 10:28 PM Erick Erickson <erickerickson@gmail.com>
wrote:

> bq. To answer your question about index size on
> disk, it is 3 TB on every node. As mentioned it's a 32 GB machine and I
> allocated 24GB to Java heap.
>
> This is massively undersized in terms of RAM in my experience. You're
> trying to cram 3TB of index into 32GB of memory. Frankly, I don't think
> there's much you can do to increase stability in this situation, too many
> things are going on. In particular, you're indexing during node restart.
>
> That means that
> 1> you'll almost inevitably get a full sync on start given your update
>      rate.
> 2> while you're doing the full sync, all new updates are sent to the
>       recovering replica and put in the tlog.
> 3> When the initial replication is done, the documents sent to the
>      tlog while recovering are indexed. This is 7 hours of accumulated
>      updates.
> 4> If much goes wrong in this situation, then you're talking another full
>      sync.
> 5> rinse, repeat.
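[To put step 3 in numbers: at the indexing rate reported elsewhere in this thread (10,000 documents every 150 seconds), a 7-hour full sync accumulates on the order of 1.7 million updates in the recovering replica's tlog. A back-of-the-envelope calculation:]

```python
# Rough arithmetic for the tlog backlog during a 7-hour full sync,
# using the indexing rate reported in this thread.
docs_per_batch = 10_000
batch_interval_s = 150
sync_hours = 7

batches = sync_hours * 3600 / batch_interval_s   # 168 update batches
backlog = int(batches * docs_per_batch)          # docs queued in the tlog
print(backlog)  # 1680000
```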
>
> There are no magic tweaks here. You really have to rethink your
> architecture. I'm actually surprised that your queries are performant.
> I expect you're getting a _lot_ of I/O, that is the relevant parts of your
> index are swapping in and out of the OS memory space. A _lot_.
> Or you're only using a _very_ small bit of your index.
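[One way to see the pressure: with 24GB of a 32GB machine given to the heap, at most roughly 8GB is left for the OS page cache against a 3TB index, so well under 1% of the index can be memory-resident at any time:]

```python
# Fraction of the index the OS page cache can hold, using the numbers
# from this thread (32GB RAM, 24GB heap, 3TB index). The 8GB figure is
# a generous upper bound; the OS and other processes also need RAM.
ram_gb, heap_gb, index_gb = 32, 24, 3 * 1024
cache_gb = ram_gb - heap_gb
fraction = cache_gb / index_gb
print(f"{fraction:.2%}")  # 0.26%
```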
>
> Sorry to be so negative, but this is not a situation that's amenable to
> a quick fix.
>
> Best,
> Erick
>
>
>
>
> On Mon, Feb 11, 2019 at 4:10 PM Rahul Goswami <rahul196452@gmail.com>
> wrote:
> >
> > Thanks for the response, Erick. To answer your question about index size on
> > disk, it is 3 TB on every node. As mentioned it's a 32 GB machine and I
> > allocated 24GB to Java heap.
> >
> > Further monitoring the recovery, I see that when the follower node is
> > recovering, the leader node (which is NOT recovering) almost freezes with
> > 100% CPU usage and 80%+ memory usage. Follower node's memory usage is
> 80%+
> > but CPU is very healthy. Also Follower node's log is filled up with
> updates
> > forwarded from the leader ("...PRE_UPDATE FINISH
> > {update.distrib=FROMLEADER&distrib.from=...") and replication starts much
> > afterwards.
> > There have been instances when complete recovery took 10+ hours. We have
> > upgraded to a 4 Gbps NIC between the nodes to see if it helps.
> >
> > Also, a few followup questions:
> >
> > 1) Is there a configuration which would start throttling update requests
> > if the replica falls behind by a certain number of updates, so as to not
> > trigger a full index replication later? If not, would it be a worthy
> > enhancement?
> > 2) What would be a recommended hard commit interval for this kind of
> > setup?
> > 3) What are some of the improvements in 7.5 with respect to recovery as
> > compared to 7.2.1?
> > 4) What do the below PeerSync failure log lines mean? This would help me
> > better understand the reasons for PeerSync failure and maybe devise an
> > alert mechanism to start throttling update requests from the application
> > program if feasible.
> >
> > *PeerSync Failure type 1*:
> > ----------------------------------
> > 2019-02-04 20:43:50.018 INFO
> > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > org.apache.solr.update.PeerSync Fingerprint comparison: 1
> >
> > 2019-02-04 20:43:50.018 INFO
> > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > org.apache.solr.update.PeerSync Other fingerprint:
> > {maxVersionSpecified=1624579878580912128,
> > maxVersionEncountered=1624579893816721408, maxInHash=1624579878580912128,
> > versionsHash=-8308981502886241345, numVersions=32966082,
> numDocs=32966165,
> > maxDoc=1828452}, Our fingerprint:
> {maxVersionSpecified=1624579878580912128,
> > maxVersionEncountered=1624579975760838656, maxInHash=1624579878580912128,
> > versionsHash=4017509388564167234, numVersions=32966066, numDocs=32966165,
> > maxDoc=1828452}
> >
> > 2019-02-04 20:43:50.018 INFO
> > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > org.apache.solr.update.PeerSync PeerSync:
> > core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> url=
> > http://indexnode1:8983/solr DONE. sync failed
> >
> > 2019-02-04 20:43:50.018 INFO
> > (recoveryExecutor-4-thread-2-processing-n:indexnode1:8983_solr
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > org.apache.solr.cloud.RecoveryStrategy PeerSync Recovery was not
> successful
> > - trying replication.
> >
> >
> > *PeerSync Failure type 2*:
> > ---------------------------------
> > 2019-02-02 20:26:56.256 WARN
> > (recoveryExecutor-4-thread-11-processing-n:indexnode1:20000_solr
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> > s:shard12 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node49)
> > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard12 r:core_node49
> > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46]
> > org.apache.solr.update.PeerSync PeerSync:
> > core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> url=
> > http://indexnode1:20000/solr too many updates received since start -
> > startingUpdates no longer overlaps with our currentUpdates
> >
> >
> > Regards,
> > Rahul
> >
> > On Thu, Feb 7, 2019 at 12:59 PM Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> > > bq. We have a heavy indexing load of about 10,000 documents every 150
> > > seconds.
> > > Not so heavy query load.
> > >
> > > It's unlikely that changing numRecordsToKeep will help all that much if
> > > your
> > > maintenance window is very large. Rather, that number would have to be
> > > _very_
> > > high.
> > >
> > > 7 hours is huge. How big are your indexes on disk? You're essentially
> > > going to get a
> > > full copy from the leader for each replica, so network bandwidth may
> > > be the bottleneck.
> > > Plus, every doc that gets indexed to the leader during sync will be
> stored
> > > away in the replica's tlog (not limited by numRecordsToKeep) and
> replayed
> > > after
> > > the full index replication is accomplished.
> > >
> > > Much of the retry logic for replication has been improved starting
> > > with Solr 7.3 and,
> > > in particular, Solr 7.5. That might address the replicas that just
> > > fail to ever replicate,
> > > but won't help with the fact that replicas still need a full sync.
> > >
> > > That said, by far the simplest thing would be to stop indexing during
> > > your maintenance
> > > window if at all possible.
> > >
> > > Best,
> > > Erick
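[For reference, `numRecordsToKeep` lives in the `<updateLog>` section of solrconfig.xml, alongside the autoCommit settings discussed below. The values here are illustrative only, and, as Erick notes, raising them is a band-aid rather than a fix:]

```xml
<!-- solrconfig.xml sketch; values are illustrative, not recommendations.
     numRecordsToKeep bounds how far a replica can fall behind and still
     recover via PeerSync instead of a full index replication. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <int name="numRecordsToKeep">10000</int>  <!-- default is 100 -->
    <int name="maxNumLogsToKeep">20</int>     <!-- default is 10 -->
  </updateLog>
  <autoCommit>
    <maxTime>180000</maxTime>  <!-- 3 minutes, as in this setup -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```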
> > >
> > > On Tue, Feb 5, 2019 at 9:11 PM Rahul Goswami <rahul196452@gmail.com>
> > > wrote:
> > > >
> > > > Hello Solr gurus,
> > > >
> > > > So I have a scenario where on Solr cluster restart the replica node
> goes
> > > > into full index replication for about 7 hours. Both replica nodes are
> > > > restarted around the same time for maintenance. Also, during usual
> times,
> > > > if one node goes down for whatever reason, upon restart it again does
> > > index
> > > > replication. In certain instances, some replicas just fail to
> recover.
> > > >
> > > > *SolrCloud 7.2.1 *cluster configuration*:*
> > > > ============================
> > > > 16 shards - replication factor=2
> > > >
> > > > Per server configuration:
> > > > ======================
> > > > 32GB machine - 16GB heap space for Solr
> > > > Index size : 3TB per server
> > > >
> > > > autoCommit (openSearcher=false) of 3 minutes
> > > >
> > > > We have a heavy indexing load of about 10,000 documents every 150
> > > seconds.
> > > > Not so heavy query load.
> > > >
> > > > Reading through some of the threads on similar topic, I suspect it
> would
> > > be
> > > > the disparity between the number of updates(>100) between the
> replicas
> > > that
> > > > is causing this (courtesy our indexing load). One of the suggestions
> I
> > > saw
> > > > was using numRecordsToKeep.
> > > > However as Erick mentioned in one of the threads, that's a bandaid
> > > measure
> > > > and I am trying to eliminate some of the fundamental issues that
> might
> > > > exist.
> > > >
> > > > 1) Is the heap too small for that index size? If yes, what would be a
> > > > recommended max heap size?
> > > > 2) Is there a general guideline to estimate the required max heap
> based
> > > on
> > > > index size on disk?
> > > > 3) What would be a recommended autoCommit and autoSoftCommit
> interval ?
> > > > 4) Any configurations that would help improve the restart time and
> avoid
> > > > full replication?
> > > > 5) Does Solr retain "numRecordsToKeep" number of  documents in tlog
> *per
> > > > replica*?
> > > > 6) The reasons for peersync from below logs are not completely clear
> to
> > > me.
> > > > Can someone please elaborate?
> > > >
> > > > *PeerSync fails with* :
> > > >
> > > > Failure type 1:
> > > > -----------------
> > > > 2019-02-04 20:43:50.018 INFO
> > > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66
> r:core_node45)
> > > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11
> r:core_node45
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > > org.apache.solr.update.PeerSync Fingerprint comparison: 1
> > > >
> > > > 2019-02-04 20:43:50.018 INFO
> > > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66
> r:core_node45)
> > > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11
> r:core_node45
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > > org.apache.solr.update.PeerSync Other fingerprint:
> > > > {maxVersionSpecified=1624579878580912128,
> > > > maxVersionEncountered=1624579893816721408,
> maxInHash=1624579878580912128,
> > > > versionsHash=-8308981502886241345, numVersions=32966082,
> > > numDocs=32966165,
> > > > maxDoc=1828452}, Our fingerprint:
> > > {maxVersionSpecified=1624579878580912128,
> > > > maxVersionEncountered=1624579975760838656,
> maxInHash=1624579878580912128,
> > > > versionsHash=4017509388564167234, numVersions=32966066,
> numDocs=32966165,
> > > > maxDoc=1828452}
> > > >
> > > > 2019-02-04 20:43:50.018 INFO
> > > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66
> r:core_node45)
> > > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11
> r:core_node45
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > > org.apache.solr.update.PeerSync PeerSync:
> > > >
> core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > url=
> > > > http://indexnode1:8983/solr DONE. sync failed
> > > >
> > > > 2019-02-04 20:43:50.018 INFO
> > > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:8983_solr
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66
> r:core_node45)
> > > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11
> r:core_node45
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > > org.apache.solr.cloud.RecoveryStrategy PeerSync Recovery was not
> > > successful
> > > > - trying replication.
> > > >
> > > >
> > > > Failure type 2:
> > > > ------------------
> > > > 2019-02-02 20:26:56.256 WARN
> > > > (recoveryExecutor-4-thread-11-processing-n:indexnode1:20000_solr
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> > > > s:shard12 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66
> r:core_node49)
> > > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard12
> r:core_node49
> > > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46]
> > > > org.apache.solr.update.PeerSync PeerSync:
> > > >
> core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> > > url=
> > > > http://indexnode1:20000/solr too many updates received since start -
> > > > startingUpdates no longer overlaps with our currentUpdates
> > > >
> > > >
> > > > Thanks,
> > > > Rahul
> > >
>
