lucene-dev mailing list archives

From "Steve Molloy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8393) Component for Solr resource usage planning
Date Mon, 21 Mar 2016 14:18:25 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204298#comment-15204298 ]

Steve Molloy commented on SOLR-8393:
------------------------------------

h1. Sizing Component

The Solr SizeComponent computes resource usage information for a given Solr core. Its
computations are based on the current index schema, the Solr configuration and the documents
indexed in the core. It is not meant to be distributed; for sizing distributed collections,
see the cluster sizing action of the collection admin API.

h2. Configuration

The SizeComponent, like any search component other than the base ones, must be defined in
the solrconfig.xml file before it can be used. This is done in two steps.

1- Declare the component:

    <searchComponent name="size" class="solr.SizeComponent" />  

2- Add the component to a request handler. Adding it to the default /select handler makes
it easier to use:

    <requestHandler name="/select" class="solr.SearchHandler">  
    ... 
        <arr name="last-components">  
      ...
            <str>size</str>  
        </arr>  
    </requestHandler>  

h2. Usage

Once the SizeComponent is configured, enable it in a standard query by setting the size
parameter to true:

http://localhost:8983/solr/core/select?q=*:*&rows=0&wt=xml&size=true
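
When trying several parameter combinations, it can help to assemble this URL programmatically. The following is a minimal sketch, assuming the host, port and core name from the example URL above. The helper name `build_size_url` is hypothetical and not part of Solr; the query parameters are the ones listed in the parameter table.

```python
from urllib.parse import urlencode


def build_size_url(core_url, **sizing_params):
    """Build a /select URL that enables the size component.

    core_url is the URL of the core, e.g. http://localhost:8983/solr/core.
    sizing_params are optional SizeComponent parameters such as numDocs.
    (Hypothetical helper, for illustration only.)
    """
    # Same base parameters as the example URL: match all documents,
    # return no rows, ask for XML and turn sizing on.
    params = {"q": "*:*", "rows": 0, "wt": "xml", "size": "true"}
    params.update(sizing_params)
    return core_url.rstrip("/") + "/select?" + urlencode(params)


url = build_size_url("http://localhost:8983/solr/core", numDocs=1000000)
print(url)
```

The query string is percent-encoded by `urlencode`, so `*:*` appears in encoded form in the printed URL.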

h3. Parameters
||name||type||default||description||
|size|boolean|false|If set to true, sizing information will be included in the response.|
|avgDocSize|long|0|Document size used to compute resource usage. If less than 1, the value will be computed from the content of the currently indexed documents.|
|numDocs|long|0|Number of documents to use when computing resource usage. If less than 1, the actual number of indexed documents will be used. This parameter is ignored if estimationRatio is specified.|
|estimationRatio|double|0.0|Ratio used for resource usage estimations. If a value greater than 0.0 is specified, the current number of documents is multiplied by this ratio to determine the number of documents used when computing resource usage.|
|deletedDocs|long|-|Number of deleted documents in the index to use when computing resource usage. If not specified, the current number of deleted documents is used.|
|filterCacheMax|long|-|Size of the filter cache to use when computing resource usage. If not specified, the current filter cache size is used.|
|queryResultCacheMax|long|-|Size of the query result cache to use when computing resource usage. If not specified, the current query result cache size is used.|
|documentCacheMax|long|-|Size of the document cache to use when computing resource usage. If not specified, the current document cache size is used.|
|queryResultMaxDocsCached|long|-|Maximum number of documents to cache per entry in the query result cache when computing resource usage. If not specified, the current maximum is used.|
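
The precedence between numDocs and estimationRatio described in the table can be sketched as follows. This is an illustration of the documented rules only, not the component's actual code: the function name is hypothetical, and truncating the scaled count to an integer is an assumption, since the description does not say how rounding is handled.

```python
def effective_num_docs(current_docs, num_docs=0, estimation_ratio=0.0):
    """Return the document count used for the estimation.

    Illustrative only; mirrors the parameter table, not the real code.
    """
    # estimationRatio wins over numDocs when it is greater than 0.0.
    if estimation_ratio > 0.0:
        # Assumption: the scaled count is truncated to an integer.
        return int(current_docs * estimation_ratio)
    # numDocs is used only when it is at least 1.
    if num_docs >= 1:
        return num_docs
    # Otherwise fall back to the actual number of indexed documents.
    return current_docs


print(effective_num_docs(2287))
print(effective_num_docs(2287, num_docs=5000))
print(effective_num_docs(2287, num_docs=5000, estimation_ratio=2.0))
```

With 2287 indexed documents, as in the sample response, the three calls cover the default case, an explicit numDocs, and a numDocs that is ignored because estimationRatio is set.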

h3. Response

{code:xml}
    <?xml version="1.0" encoding="UTF-8"?>  
    <response>  
    <lst name="responseHeader">  
      <int name="status">0</int>  
      <int name="QTime">109</int>  
      <lst name="params">  
        <str name="q">*:*</str>  
        <str name="size">true</str>  
        <str name="indent">true</str>  
        <str name="rows">0</str>  
        <str name="wt">xml</str>  
      </lst>  
    </lst>  
    <result name="response" numFound="2287" start="0">  
    </result>  
    <lst name="size">  
      <str name="total-disk-size">199.6 MB</str>  
      <str name="total-lucene-RAM">33.35 MB</str>  
      <str name="total-solr-RAM">79.16 MB</str>  
      <long name="estimated-num-docs">2287</long>  
      <str name="estimated-doc-size">89.37 KB</str>  
      <lst name="solr-details">  
        <str name="filterCache">152.94 KB</str>  
        <str name="queryResultCache">1,000 KB</str>  
        <str name="documentCache">44.68 MB</str>  
        <str name="luceneRam">33.35 MB</str>  
      </lst>  
    </lst>  
    </response>  
{code}

||result field||sub-field||description||
|total-disk-size| |Estimation of total disk space used by the index according to parameters.|
|total-lucene-RAM| |Estimation of index RAM usage specifically for Lucene according to parameters.|
|total-solr-RAM| |Estimation of total index RAM usage for Solr (including Lucene) according to parameters.|
|estimated-num-docs| |Number of documents used for computing estimated values.|
|estimated-doc-size| |Average document size used for computing estimated values.|
|solr-details|filterCache|Estimated maximum amount of RAM used for caching filters for the index, if the cache is full.|
| |queryResultCache|Estimated maximum amount of RAM used for caching query results for the index, if the cache is full.|
| |documentCache|Estimated maximum amount of RAM used for caching documents for the index, if the cache is full.|
| |luceneRam|Estimated amount of RAM used by Lucene for the index.|
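
A client that wants to consume these fields could read the size section of the XML response as sketched below. This is a minimal example using Python's standard library and a trimmed copy of the sample response above; the function names are hypothetical, and the element layout is assumed to match the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response shown above.
SAMPLE = """<response>
  <lst name="size">
    <str name="total-disk-size">199.6 MB</str>
    <str name="total-lucene-RAM">33.35 MB</str>
    <str name="total-solr-RAM">79.16 MB</str>
    <long name="estimated-num-docs">2287</long>
    <str name="estimated-doc-size">89.37 KB</str>
    <lst name="solr-details">
      <str name="filterCache">152.94 KB</str>
      <str name="queryResultCache">1,000 KB</str>
      <str name="documentCache">44.68 MB</str>
      <str name="luceneRam">33.35 MB</str>
    </lst>
  </lst>
</response>"""


def lst_to_dict(lst):
    """Convert a <lst> element into a (possibly nested) dict."""
    result = {}
    for child in lst:
        name = child.get("name")
        if child.tag == "lst":
            result[name] = lst_to_dict(child)
        elif child.tag in ("int", "long"):
            result[name] = int(child.text)
        else:
            # Sizes are formatted strings such as "199.6 MB"; keep as-is.
            result[name] = child.text
    return result


def read_size_section(xml_text):
    """Return the <lst name="size"> section as a dict, or {} if absent."""
    root = ET.fromstring(xml_text)
    size = root.find("./lst[@name='size']")
    return lst_to_dict(size) if size is not None else {}


sizes = read_size_section(SAMPLE)
print(sizes["total-disk-size"], sizes["solr-details"]["documentCache"])
```

The sizes are kept as the formatted strings the component returns, since the sample mixes units (KB and MB) and includes thousands separators.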

 
h1. Cluster Sizing

The cluster sizing action of the collections handler estimates resource usage for a complete
Solr cluster. It is based on the SizeComponent: it calls the component internally, merges the
results and computes aggregated estimations. It does not require any specific configuration
of its own, but the SizeComponent must be declared and used by the /select handler so that
the cluster sizing action can send requests to it.

h2. Usage

The cluster sizing action can be accessed through the collections handler:

http://localhost:8983/solr/admin/collections?action=clustersizing

h3. Parameters

All parameters of the SizeComponent, except for the size parameter itself, can be passed to
the cluster sizing action and will be relayed to the SizeComponent when estimating resource
usage. The parameters specific to this action are listed below; for the SizeComponent
parameters, see the parameter table in the Sizing Component section.

||name||type||default||description||
|collection|string|-|List of collections (CSV) to be included in the report. If not specified, all collections will be included.|
|shard|string|-|List of shards (CSV) to be included in the report. If not specified, all shards will be included.|
|replica|string|-|List of replicas (CSV) to be included in the report. If not specified, all replicas will be included.|
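
A request that combines the action-specific filters with relayed SizeComponent parameters could be built as in the sketch below. The helper name and the collection names are hypothetical; the parameter names are the ones from the tables above, and the base URL is the one from the usage example.

```python
from urllib.parse import urlencode


def build_cluster_sizing_url(solr_url, collections=None, shards=None,
                             replicas=None, **size_params):
    """Build a collections-handler URL for the clustersizing action.

    Hypothetical helper, for illustration only.
    """
    params = {"action": "clustersizing"}
    # The filters are comma-separated lists, as described in the table.
    if collections:
        params["collection"] = ",".join(collections)
    if shards:
        params["shard"] = ",".join(shards)
    if replicas:
        params["replica"] = ",".join(replicas)
    # The size parameter itself is not passed to this action.
    size_params.pop("size", None)
    params.update(size_params)
    return solr_url.rstrip("/") + "/admin/collections?" + urlencode(params)


cluster_url = build_cluster_sizing_url(
    "http://localhost:8983/solr",
    collections=["books", "films"],  # hypothetical collection names
    estimationRatio=2.0,
)
print(cluster_url)
```

Commas in the CSV values are percent-encoded by `urlencode`, which is equivalent to sending them literally.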

h3. Response

The response fields are the same as for the SizeComponent, but are grouped in two ways. First,
the estimated total usage is reported for each node. Then, details are reported for each
collection, grouped by shard and then by replica.

> Component for Solr resource usage planning
> ------------------------------------------
>
>                 Key: SOLR-8393
>                 URL: https://issues.apache.org/jira/browse/SOLR-8393
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Steve Molloy
>         Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch
>
>
> One question that keeps coming back is how much disk and RAM do I need to run Solr. The most common response is that it highly depends on your data. While true, it makes for frustrated users trying to plan their deployments.
> The idea I'm bringing is to create a new component that will attempt to extrapolate resources needed in the future by looking at resources currently used. By adding a parameter for the target number of documents, current resources are adapted by a ratio relative to current number of documents.




