lucene-solr-user mailing list archives

From Joe Obernberger
Subject Re: Recovery Issue - Solr 6.6.1 and HDFS
Date Mon, 27 Nov 2017 20:15:06 GMT
Just to add onto this.  Right now the cluster has recovered, and life is 
good.  My concerns with a cluster restart are lock files and network 
timeouts on startup.  The first can be addressed by stopping indexing, 
waiting until things flush out, and then halting all the nodes.  No lock 

The second is the one I'm scared about.  We use Puppet to start/stop all 
45 nodes in the cluster, and on startup there is a massive amount of 
HDFS activity that I'm afraid will put some of the replicas into 
recovery.  If that happens, we're probably in for the 'recovery, fail, 
retry loop'.  Has anyone else run into this?



On 11/27/2017 11:28 AM, Joe Obernberger wrote:
> Thank you Erick.  Right now, we have our autoCommit time set to 
> 1800000 ms (30 minutes), and our autoSoftCommit set to 120000 ms (2 
> minutes).  The thought was that with HDFS we want less frequent but 
> larger operations, since HDFS has such a large block size.  Is that 
> incorrect thinking?
> As to why we are using HDFS: for our use case, we already have a 
> large cluster that runs HBase, and we want to index data within it.  
> Adding another layer of storage that we would need to manage would add 
> complexity.  With HDFS, we just add another box that has disk, and 
> boom - more storage for all players involved.
> -Joe
> On 11/22/2017 8:17 PM, Erick Erickson wrote:
>> Hmm. This is quite possible. Any time things take "too long" it can be
>> a problem. For instance, if the leader sends docs to a replica and
>> the request times out, the leader throws the follower into "Leader
>> Initiated Recovery". The smoking gun here is that there are no errors
>> on the follower, just the notification that the leader put it into
>> recovery.
>> There are other variations on the theme, but it all boils down to
>> this: when communications fall apart, replicas go into recovery.
>> Best,
>> Erick
>> On Wed, Nov 22, 2017 at 11:02 AM, Joe Obernberger wrote:
>>> Hi Shawn - thank you for your reply. The index is 29.9TBytes as reported
>>> by:
>>> hadoop fs -du -s -h /solr6.6.0
>>> 29.9 T  89.9 T  /solr6.6.0
>>> The 89.9TBytes is due to HDFS having 3x replication.  There are about 1.1
>>> billion documents indexed and we index about 2.5 million documents per day.
>>> Assuming an even distribution, each node is handling about 680GBytes of
>>> index.  So our cache covers about 1.4% of the index on each node.  Perhaps
>>> 'relatively small block cache' was an understatement! This is why we split
>>> the largest collection into two, where one is data going back 30 days, and
>>> the other is all the data.  Most of our searches go back no more than 30
>>> days.  The 30 day index is 2.6TBytes total.  I don't know how the HDFS
>>> block cache splits between collections, but the 30 day index performs
>>> acceptably for our specific application.
>>> If we wanted to cache 50% of the index, each of our 45 nodes would need a
>>> block cache of about 350GBytes.  I'm accepting offers of DIMMs!
>>> What I believe caused our 'recovery, fail, retry loop' was that one of our
>>> servers died.  This caused HDFS to start re-replicating blocks across the
>>> cluster, which produced a lot of network activity.  When this happened, I
>>> believe there was high network contention for specific nodes in the cluster;
>>> their network interfaces became pegged and requests for HDFS blocks
>>> timed out.  When that happened, SolrCloud went into recovery, which caused
>>> more network traffic.  Fun stuff.
>>> -Joe
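
The sizing arithmetic in the message above can be reproduced with a short sketch. The input figures come from the message; the helper names and the 1 TB = 1024 GB conversion are assumptions made for illustration. With these inputs it gives about 680 GB of index per node, a cache share of roughly 1.5%, and about 340 GB of block cache per node to cover half the index, in line with the rounded figures quoted.

```python
# Figures quoted in the message above. The unit conversion
# (1 TB = 1024 GB) and all function names are illustrative assumptions.
INDEX_TB = 29.9          # index size reported by `hadoop fs -du -s -h`
HDFS_REPLICATION = 3     # HDFS replication factor mentioned in the message
NODES = 45               # Solr nodes in the cluster
CACHE_GB_PER_NODE = 10   # approximate block cache per node


def per_node_index_gb(index_tb: float, nodes: int) -> float:
    """Index handled by each node, assuming an even distribution."""
    return index_tb * 1024 / nodes


def cache_fraction(cache_gb: float, index_gb: float) -> float:
    """Share of a node's index that fits in its block cache."""
    return cache_gb / index_gb


def cache_needed_gb(index_gb: float, target_fraction: float) -> float:
    """Block cache needed to hold the given fraction of a node's index."""
    return index_gb * target_fraction


per_node = per_node_index_gb(INDEX_TB, NODES)
print(f"raw HDFS usage:  {INDEX_TB * HDFS_REPLICATION:.1f} TB")
print(f"index per node:  {per_node:.0f} GB")
print(f"cache share:     {cache_fraction(CACHE_GB_PER_NODE, per_node):.1%}")
print(f"cache for 50%:   {cache_needed_gb(per_node, 0.5):.0f} GB")
```

Changing `NODES` or `CACHE_GB_PER_NODE` shows how quickly the required RAM grows when the target cache share goes up.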
>>> On 11/22/2017 11:44 AM, Shawn Heisey wrote:
>>>> On 11/22/2017 6:44 AM, Joe Obernberger wrote:
>>>>> Right now, we have a relatively small block cache due to the
>>>>> requirements that the servers run other software.  We tried to find
>>>>> the best balance between block cache size, and RAM for programs, while
>>>>> still giving enough for local FS cache.  This came out to be 84 128M
>>>>> blocks - or about 10G for the cache per node (45 nodes total).
>>>> How much data is being handled on a server with 10GB allocated for
>>>> caching HDFS data?
>>>> The first message in this thread says the index size is 31TB, which is
>>>> *enormous*.  You have also said that the index takes 93TB of disk
>>>> space.  If the data is distributed somewhat evenly, then the answer to
>>>> my question would be that each of those 45 Solr servers would be
>>>> handling over 2TB of data.  A 10GB cache is *nothing* compared to 2TB.
>>>> When index data that Solr needs to access for an operation is not in the
>>>> cache and Solr must actually wait for disk and/or network I/O, the
>>>> resulting performance usually isn't very good.  In most cases you don't
>>>> need to have enough memory to fully cache the index data ... but less
>>>> than half a percent is not going to be enough.
>>>> Thanks,
>>>> Shawn
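
As a rough check of the figures in this exchange, the sketch below derives the block cache size from "84 128M blocks" and compares it with the per-node data volume implied by 93TB across 45 servers. The function names and the binary unit conversions are assumptions for illustration only; the inputs are the numbers quoted above.

```python
# Figures quoted in the two messages above. Binary units
# (1 GB = 1024 MB, 1 TB = 1024 GB) and function names are assumptions.
SLAB_COUNT = 84   # "84 128M blocks"
SLAB_MB = 128     # size of each cache block in MB
RAW_TB = 93       # disk space quoted for the index (includes HDFS replication)
NODES = 45        # Solr servers


def block_cache_gb(slab_count: int, slab_mb: int) -> float:
    """Total block cache per node in GB."""
    return slab_count * slab_mb / 1024


def data_per_node_tb(total_tb: float, nodes: int) -> float:
    """Data handled per node, assuming an even distribution."""
    return total_tb / nodes


def cached_share(cache_gb: float, data_tb: float) -> float:
    """Fraction of a node's data that fits in the block cache."""
    return cache_gb / (data_tb * 1024)


cache = block_cache_gb(SLAB_COUNT, SLAB_MB)
per_node = data_per_node_tb(RAW_TB, NODES)
print(f"block cache per node: {cache:.1f} GB")
print(f"data per node:        {per_node:.2f} TB")
print(f"cached share:         {cached_share(cache, per_node):.2%}")
```

With these inputs the cache comes to 10.5 GB against a little over 2 TB per node, which is just under half a percent.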
