lucene-solr-user mailing list archives

From Hendrik Haddorp <hendrik.hadd...@gmx.net>
Subject Re: Recovery Issue - Solr 6.6.1 and HDFS
Date Tue, 21 Nov 2017 19:27:30 GMT
Unfortunately I cannot share my cleanup code, but the steps are quite 
simple. I wrote it in Java using the HDFS API and Curator for 
ZooKeeper. The steps are:
     - read out the children of /collections in ZK so you know all the 
collection names
     - read /collections/<collection name>/state.json to get the state
     - find the replicas in the state and keep only those whose 
"node_name" matches your local node (the node name is basically a 
combination of your host name and the Solr port)
     - if the replica data has "dataDir" set then you basically only 
need to add "index/write.lock" to it and you have the lock location
     - if "dataDir" is not set (not really sure why) then you need to 
construct it yourself: <hdfs base path>/<collection name>/<replica 
name>/data/index/write.lock
     - if the lock file exists, delete it
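
The path logic in the steps above can be sketched as follows. This is only a minimal illustration, not the actual cleanup code: it assumes the replica entries from state.json have already been parsed into plain maps (replica name to properties), and the class and method names (LockPaths, lockPath, lockPathsForNode) are hypothetical. It also assumes the HDFS base path is given without a trailing slash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class LockPaths {

    /**
     * Lock location for one replica. Uses "dataDir" when it is set,
     * otherwise builds <hdfs base path>/<collection>/<replica>/data.
     * Assumes hdfsBase has no trailing slash.
     */
    static String lockPath(Map<String, String> replica, String hdfsBase,
                           String collection, String replicaName) {
        String dataDir = replica.get("dataDir");
        if (dataDir == null || dataDir.isEmpty()) {
            dataDir = hdfsBase + "/" + collection + "/" + replicaName + "/data";
        }
        // Tolerate a dataDir with or without a trailing slash.
        String sep = dataDir.endsWith("/") ? "" : "/";
        return dataDir + sep + "index/write.lock";
    }

    /** Lock locations of all replicas of one collection hosted on localNode. */
    static List<String> lockPathsForNode(Map<String, Map<String, String>> replicas,
                                         String localNode, String hdfsBase,
                                         String collection) {
        List<String> paths = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : replicas.entrySet()) {
            // Keep only replicas whose node_name matches the local node.
            if (localNode.equals(e.getValue().get("node_name"))) {
                paths.add(lockPath(e.getValue(), hdfsBase, collection, e.getKey()));
            }
        }
        return paths;
    }
}
```

In a real tool the collection names and state.json would come from Curator (for example getChildren().forPath("/collections") and getData().forPath(...)), and the existence check and delete would go through the Hadoop FileSystem API (exists and delete).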

I believe there is a small race condition if you use replica auto 
failover. So I try to keep the time between checking the state in 
ZooKeeper and deleting the lock file as short as possible: rather than 
first determining all lock file locations and only then deleting them, 
I delete each lock file right after checking the state.
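
To illustrate the ordering described above, here is a small sketch that reads the state of one collection and deletes its lock files before moving on to the next collection, instead of collecting every path first. The two callbacks are placeholders for the ZooKeeper read and the HDFS delete, and the names (InterleavedCleanup, clean, readLockPaths, deleteIfExists) are hypothetical. Interleaving per collection is an assumption made here because state.json is read per collection.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class InterleavedCleanup {

    /**
     * For each collection: read its current state, then immediately delete
     * the lock files found there, before looking at the next collection.
     * This keeps the window between "check" and "delete" short.
     */
    static void clean(List<String> collections,
                      Function<String, List<String>> readLockPaths,
                      Consumer<String> deleteIfExists) {
        for (String collection : collections) {
            // State is read as late as possible, right before the deletes.
            for (String lock : readLockPaths.apply(collection)) {
                deleteIfExists.accept(lock);
            }
        }
    }
}
```

This does not remove the race, it only narrows it: a replica could still fail over between the read and the delete.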

regards,
Hendrik

On 21.11.2017 19:53, Joe Obernberger wrote:
> A clever idea.  Normally what we do when we need to do a restart, is 
> to halt indexing, and then wait about 30 minutes.  If we do not wait, 
> and stop the cluster, the default script's 180-second timeout is not 
> enough and we'll have lock files to clean up.  We use puppet to start 
> and stop the nodes, but at this point that is not working well since 
> we need to start one node at a time.  With each one taking hours, this 
> is a lengthy process!  I'd love to see your script!
>
> This new error is now coming up - see screen shot.  For some reason 
> some of the shards have no leader assigned:
>
> http://lovehorsepower.com/SolrClusterErrors.jpg
>
> -Joe
>
>
> On 11/21/2017 1:34 PM, Hendrik Haddorp wrote:
>> Hi,
>>
>> I see the write.lock issue as well when Solr has not been stopped 
>> gracefully. The write.lock files are then left in HDFS, as they do 
>> not get removed automatically when the client disconnects, the way an 
>> ephemeral node in ZooKeeper would be. Unfortunately Solr also does not 
>> realize that it should own the lock, even though it is marked as the 
>> owner in the state stored in ZooKeeper, and it is not willing to 
>> retry, which is why you need to restart the whole Solr instance after 
>> the cleanup. I added some logic to my Solr startup script which scans 
>> the lock files in HDFS, compares them with the state in ZooKeeper, and 
>> then deletes all lock files that belong to the node that I'm starting.
>>
>> regards,
>> Hendrik
>>
>> On 21.11.2017 14:07, Joe Obernberger wrote:
>>> Hi All - we have a system with 45 physical boxes running solr 6.6.1 
>>> using HDFS as the index. The current index size is about 31TBytes. 
>>> With 3x replication that takes up 93TBytes of disk. Our main 
>>> collection is split across 100 shards with 3 replicas each.  The 
>>> issue that we're running into is when restarting the solr6 cluster.  
>>> The shards go into recovery and start to utilize nearly all of their 
>>> network interfaces.  If we start too many of the nodes at once, the 
>>> shards will go into a recovery, fail, and retry loop and never come 
>>> up.  The errors are related to HDFS not responding fast enough and 
>>> warnings from the DFSClient.  If we stop a node when this is 
>>> happening, the script will force a stop (180 second timeout) and 
>>> upon restart, we have lock files (write.lock) inside of HDFS.
>>>
>>> The process at this point is to start one node, find out the lock 
>>> files, wait for it to come up completely (hours), stop it, delete 
>>> the write.lock files, and restart.  Usually this second restart is 
>>> faster, but it still can take 20-60 minutes.
>>>
>>> The smaller indexes recover much faster (less than 5 minutes). 
>>> Should we not have used so many replicas with HDFS?  Is there a 
>>> better way we should have built the solr6 cluster?
>>>
>>> Thank you for any insight!
>>>
>>> -Joe
>>>
>>
>>
>

