lucene-solr-user mailing list archives

From Andrzej Białecki ...@getopt.org>
Subject Re: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks
Date Wed, 19 Jun 2019 12:07:02 GMT
Hi Andrew,

Please create a JIRA issue and attach this patch; I’ll look into fixing this. Thanks!


> On 18 Jun 2019, at 23:19, Andrew Kettmann <andrew.kettmann@evolve24.com> wrote:
> 
> I attached the patch, but attachments aren't sent out on the mailing list; my mistake. Patch below:
> 
> 
> 
> ### START
> 
> diff --git a/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java b/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> index 24a52eaf97..e018f8a42f 100644
> --- a/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> +++ b/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> @@ -135,7 +135,9 @@ public class SplitShardCmd implements OverseerCollectionMessageHandler.Cmd {
>     }
> 
>     RTimerTree t = timings.sub("checkDiskSpace");
> -    checkDiskSpace(collectionName, slice.get(), parentShardLeader);
> +    if (splitMethod != SolrIndexSplitter.SplitMethod.LINK) {
> +      checkDiskSpace(collectionName, slice.get(), parentShardLeader);
> +    }
>     t.stop();
> 
>     // let's record the ephemeralOwner of the parent leader node
> 
> ### END
> 
> ________________________________
> From: Andrew Kettmann
> Sent: Tuesday, June 18, 2019 3:05:15 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks
> 
> 
> It looks like the disk check here is the problem. I am no Java developer, but this patch
> skips the check if you are using the link method for splitting. Attached the patch. This
> is off of the commit for 7.7.2, d4c30fc285. The modified version only has to be run on the
> overseer machine, so there is that at least.
> 
> ________________________________
> From: Andrew Kettmann
> Sent: Tuesday, June 18, 2019 11:32:43 AM
> To: solr-user@lucene.apache.org
> Subject: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks
> 
> 
> I am using the Solr 7.7.2 Docker image and testing some of the new autoscale features; huge
> fan so far. I tested the link method on a 2GB core and found that the split took less than
> 1MB of additional space. I then filled the core quite a bit larger, to 12GB of a 20GB PVC,
> and now splitting the shard fails with the following error message on my overseer:
> 
> 
> 2019-06-18 16:27:41.754 ERROR (OverseerThreadFactory-49-thread-5-processing-n:10.0.192.74:8983_solr) [c:test_autoscale s:shard1  ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test_autoscale operation: splitshard failed:org.apache.solr.common.SolrException: not enough free disk space to perform index split on node 10.0.193.23:8983_solr, required: 23.35038321465254, available: 7.811378479003906
>    at org.apache.solr.cloud.api.collections.SplitShardCmd.checkDiskSpace(SplitShardCmd.java:567)
>    at org.apache.solr.cloud.api.collections.SplitShardCmd.split(SplitShardCmd.java:138)
>    at org.apache.solr.cloud.api.collections.SplitShardCmd.call(SplitShardCmd.java:94)
>    at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:294)
>    at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
>    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>    at java.base/java.lang.Thread.run(Thread.java:834)
> 
> 
> 
> I attempted sending the request to the node itself to see if it did anything different,
> but no luck. My parameters are (note the Python formatting, as that is my language of choice):
> 
> 
> 
> splitparams = {'action':'SPLITSHARD',
>               'collection':'test_autoscale',
>               'shard':'shard1',
>               'splitMethod':'link',
>               'timing':'true',
>               'async':'shardsplitasync'}
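
For readers who want to reproduce the request, here is a minimal sketch of how the parameters above could be turned into Collections API URLs. The base URL is a placeholder (an assumption, not taken from the thread), and `collections_url` is a hypothetical helper name. The second URL assumes the standard REQUESTSTATUS action is used to poll the async request id.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption): substitute the address of a node in your cluster.
SOLR_BASE = "http://localhost:8983/solr"

# Parameters exactly as given in the message above.
splitparams = {
    "action": "SPLITSHARD",
    "collection": "test_autoscale",
    "shard": "shard1",
    "splitMethod": "link",
    "timing": "true",
    "async": "shardsplitasync",
}


def collections_url(base, params):
    """Hypothetical helper: build a Collections API URL from a parameter dict."""
    return f"{base}/admin/collections?{urlencode(params)}"


# URL that submits the split. Because 'async' is set, the call returns immediately.
split_url = collections_url(SOLR_BASE, splitparams)

# URL that polls the outcome of the async request by its id.
status_url = collections_url(
    SOLR_BASE,
    {"action": "REQUESTSTATUS", "requestid": splitparams["async"]},
)

if __name__ == "__main__":
    print(split_url)
    print(status_url)
```

Either URL can then be fetched with any HTTP client. Since the split is submitted asynchronously, a failure such as the disk space error would only show up in the overseer log or in the REQUESTSTATUS response, not in the initial reply.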
> 
> 
> And this is confirmed by the log message from the node itself:
> 
> 
> 2019-06-18 16:27:41.730 INFO  (qtp1107530534-16) [c:test_autoscale   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={async=shardsplitasync&timing=true&action=SPLITSHARD&collection=test_autoscale&shard=shard1&splitMethod=link} status=0 QTime=20
> 
> 
> While it is true that I would not have enough space if I were using the rewrite method, the
> link method on a 2GB core used less than 1MB of additional space. Is there something I am
> missing here? Is there an option to disable the disk space check that I need to pass? I can't
> find anything in the documentation at this point.
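
The numbers in the error can be sanity-checked with a small model of the pre-split check. This is a sketch under a stated assumption: the check is taken to require free space of at least twice the index size, which is inferred from the log ("required" is roughly twice the 12GB the core was filled to) and is not confirmed anywhere in the thread. The function name and the `skip_for_link` flag are hypothetical; the flag mimics the behavior proposed in the patch.

```python
# Figures reported in the overseer log (presumably GB).
REQUIRED_GB = 23.35038321465254
AVAILABLE_GB = 7.811378479003906


def disk_check_passes(index_gb, free_gb, split_method, skip_for_link=False):
    """Model of the pre-split disk check.

    ASSUMPTION: the check demands free >= 2 * index size. The factor of 2 is
    inferred from the log figures, not taken from the Solr source.
    skip_for_link=True mimics the proposed patch, which bypasses the check
    when the link split method is requested.
    """
    if skip_for_link and split_method.lower() == "link":
        return True
    return free_gb >= 2.0 * index_gb


# Index size implied by the log under the 2x assumption (about 11.7).
index_gb = REQUIRED_GB / 2

if __name__ == "__main__":
    # Unpatched behavior: fails regardless of split method.
    print(disk_check_passes(index_gb, AVAILABLE_GB, "link"))
    # Patched behavior: link splits skip the check, rewrite splits still run it.
    print(disk_check_passes(index_gb, AVAILABLE_GB, "link", skip_for_link=True))
    print(disk_check_passes(index_gb, AVAILABLE_GB, "rewrite", skip_for_link=True))
```

Under this model the request fails for any split method with the reported free space, which matches the observed error. With the patch's bypass, only the link method proceeds.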
> 
> 
> Andrew Kettmann
> DevOps Engineer
> P: 1.314.596.2836

