lucene-solr-user mailing list archives

From Himanshu Sachdeva <himan...@limeroad.com>
Subject Re: Upgrading cluster from 4 to 5. Slow replication detected.
Date Wed, 19 Apr 2017 06:56:47 GMT
I am guessing that the index got corrupted somehow, so I deleted the data
directory on the slave. It has started copying the index again. I'll report
here once that completes. If you have any other suggestions in the meantime,
please reply. Thanks.
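
While the copy runs, one way to track how far a slave is behind is to compare the index generation each node reports. Below is a minimal Python sketch, assuming the replication handler's `command=indexversion` response is JSON with a top-level `generation` field. The helper names, host, and core name are hypothetical, and the sample bodies in the usage lines are made up for illustration, not real output from our nodes.

```python
import json
from urllib.parse import urlencode


def indexversion_url(base_url, core):
    """Build the replication-handler URL that reports index version and generation.

    base_url and core are placeholders for your own host and core name.
    """
    query = urlencode({"command": "indexversion", "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def generation_lag(reference_body, candidate_body):
    """Return how many generations the candidate node is behind the reference node.

    Both arguments are JSON response bodies; a top-level "generation"
    field is assumed to be present in each.
    """
    reference = json.loads(reference_body)["generation"]
    candidate = json.loads(candidate_body)["generation"]
    return reference - candidate


# Usage with made-up response bodies (fetch the real ones with any HTTP client).
solr4_body = '{"indexversion": 2, "generation": 107}'
solr5_body = '{"indexversion": 1, "generation": 100}'
lag = generation_lag(solr4_body, solr5_body)
```

A positive result means the second node is behind the first; polling both nodes periodically would show whether the gap is growing or closing.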

On Wed, Apr 19, 2017 at 12:21 PM, Himanshu Sachdeva <himanshu@limeroad.com>
wrote:

> Hello Shawn,
>
> Thanks for taking the time to help me. I had set both the initial and
> maximum heap size to 45GB. The logs show the following two warnings
> repeatedly:
>
>    - IndexFetcher : Cannot complete replication attempt because file
>    already exists.
>    - IndexFetcher : Replication attempt was not successful - trying a
>    full index replication reloadCore=false.
>
>
>
> On Tue, Apr 18, 2017 at 6:58 PM, Shawn Heisey <apache@elyograg.org> wrote:
>
>> On 4/14/2017 2:10 AM, Himanshu Sachdeva wrote:
>> > We're starting to upgrade our Solr cluster to version 5.5. We removed
>> > one slave node from the cluster, installed Solr 5.5.4 on it, and
>> > started Solr, and it began copying the index from the master.
>> > However, we noticed a drop in replication speed compared to the
>> > other nodes, which were still running Solr 4. To make a fair
>> > comparison, I removed another slave node from the cluster and disabled
>> > replication on it until the new node had caught up with it. When both
>> > nodes were at the same index generation, I turned replication on for
>> > both. It has now been over 15 hours since this exercise, and the
>> > new node has again started lagging behind. Currently, the node with
>> > Solr 5.5 is seven generations behind the other node.
>>
>> Version 5 is capable of replication bandwidth throttling, but unless you
>> actually configure the maxWriteMBPerSec attribute in the replication
>> handler definition, this should not happen by default.
>>
>> One problem that I think might be possible is that the heap has been
>> left at the default 512MB on the new 5.5.4 install and therefore the
>> machine is doing constant full garbage collections to free up memory for
>> normal operation, which would make Solr run EXTREMELY slowly.
>> Eventually a machine in this state would most likely encounter an
>> OutOfMemoryError.  On non-Windows systems, OOME will cause a forced halt
>> of the entire Solr instance.
>>
>> The heap might not be the problem ... if it's not, then I do not know
>> what is going on.  Are there any errors or warnings in solr.log?
>>
>> Thanks,
>> Shawn
>>
>>
>
>
> --
> Himanshu Sachdeva
>
>


-- 
Himanshu Sachdeva
