lucene-solr-user mailing list archives

From kshitij tyagi <kshitij.shopcl...@gmail.com>
Subject Re: Queries regarding solr cache
Date Mon, 05 Dec 2016 13:44:51 GMT
Hi Shawn,

Thanks for the reply.

Here are the details for the query result cache (I am not using NOW in my
queries, and most of the queries are common):


   - class:org.apache.solr.search.LRUCache
   - version:1.0
   - description:LRU Cache(maxSize=1000, initialSize=1000,
   autowarmCount=10,
   regenerator=org.apache.solr.search.SolrIndexSearcher$3@73380510)
   - src:null
   - stats:
      - lookups:381
      - hits:24
      - hitratio:0.06
      - inserts:363
      - evictions:0
      - size:345
      - warmupTime:2932
      - cumulative_lookups:294948
      - cumulative_hits:15840
      - cumulative_hitratio:0.05
      - cumulative_inserts:277963
      - cumulative_evictions:70078

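As a quick sanity check on the numbers above, the reported ratios can be recomputed from the raw counters. This is only a sketch: the helper name `hit_ratio` is made up for illustration, and it assumes the ratio is simply hits divided by lookups.

```python
def hit_ratio(hits: int, lookups: int) -> float:
    """Fraction of lookups answered from the cache (0.0 when there were no lookups)."""
    return hits / lookups if lookups else 0.0


# Counters copied from the stats listed above.
current = hit_ratio(hits=24, lookups=381)
cumulative = hit_ratio(hits=15840, lookups=294948)

print(f"current hit ratio:    {current:.2f}")     # 0.06
print(f"cumulative hit ratio: {cumulative:.2f}")  # 0.05
print(f"current misses:       {381 - 24}")        # 357
```

Both values agree with the reported hitratio and cumulative_hitratio, so roughly 94 of every 100 lookups miss the cache.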
How can I increase my hit ratio? I do not clearly understand Solr's caching
mechanism. Please help.



On Thu, Dec 1, 2016 at 8:19 PM, Shawn Heisey <apache@elyograg.org> wrote:

> On 12/1/2016 4:04 AM, kshitij tyagi wrote:
> > I am using Solr and serving huge number of requests in my application.
> >
> > I need to know how can I utilize caching in Solr.
> >
> > As of now in  then clicking Core Selector → [core name] → Plugins /
> Stats.
> >
> > I am seeing my hit ratio as 0 for all the caches. What does this mean,
> > and how can this be optimized?
>
> If your hitratio is zero, then none of the queries related to that cache
> are finding matches.  This means that your client systems are never
> sending the same query twice.
>
> One possible reason for a zero hitratio is using "NOW" in date queries
> -- NOW changes every millisecond, and the actual timestamp value is what
> ends up in the cache.  This means that the same query with NOW executed
> more than once will actually be different from the cache's perspective.
> The solution is date rounding -- using things like NOW/HOUR or NOW/DAY.
> You could use NOW/MINUTE, but the window for caching would be quite small.
>
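The effect of NOW on caching described above can be illustrated with a toy model. This is not Solr's code: the `resolve` function and the `created` field are hypothetical, and the sketch only assumes, as the message says, that the resolved timestamp is what ends up in the cache key.

```python
from datetime import datetime, timezone


def stamp(dt: datetime) -> str:
    """Format a datetime as an ISO-8601 UTC timestamp with millisecond precision."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def resolve(query: str, now: datetime) -> str:
    """Toy stand-in for date math: substitute concrete timestamps for NOW tokens."""
    hour = now.replace(minute=0, second=0, microsecond=0)
    day = hour.replace(hour=0)
    # Replace the longer tokens first so plain "NOW" does not clobber them.
    return (query.replace("NOW/DAY", stamp(day))
                 .replace("NOW/HOUR", stamp(hour))
                 .replace("NOW", stamp(now)))


# Two requests a little over a second apart, within the same hour.
t1 = datetime(2016, 12, 5, 13, 44, 51, 0, tzinfo=timezone.utc)
t2 = datetime(2016, 12, 5, 13, 44, 52, 250000, tzinfo=timezone.utc)

unrounded = "created:[* TO NOW]"
rounded = "created:[* TO NOW/HOUR]"

# The unrounded query resolves to a different string each time: a cache miss.
print(resolve(unrounded, t1) == resolve(unrounded, t2))  # False
# The rounded query resolves to the same string for the whole hour: a cache hit.
print(resolve(rounded, t1) == resolve(rounded, t2))      # True
```

In this model the rounded query stays identical until the hour rolls over, which is why coarser rounding gives a longer caching window.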
> 5000 entries for your filterCache is almost certainly too big.  Each
> filterCache entry tends to be quite large.  If the core has ten million
> documents in it, then each filterCache entry would be 1.25 million bytes
> in size -- the entry is a bitset of all documents in the core.  This
> includes deleted docs that have not yet been reclaimed by merging.  If a
> filterCache for an index that size (which is not all that big) were to
> actually fill up with 5000 entries, it would require over six gigabytes
> of memory just for the cache.
>
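The filterCache arithmetic above can be reproduced in a few lines. This is a rough sketch that assumes, as the message describes, every entry is a full bitset with one bit per document in the core; the function names are made up for illustration.

```python
def filter_entry_bytes(max_doc: int) -> int:
    """Size of one bitset entry: one bit per document, rounded up to whole bytes."""
    return (max_doc + 7) // 8


def filter_cache_bytes(max_doc: int, entries: int) -> int:
    """Worst-case size of a cache in which every entry is a full bitset."""
    return filter_entry_bytes(max_doc) * entries


per_entry = filter_entry_bytes(10_000_000)
full_cache = filter_cache_bytes(10_000_000, 5000)

print(per_entry)         # 1250000 bytes per entry
print(full_cache / 1e9)  # 6.25 (decimal gigabytes)
```

Ten million documents give 1.25 million bytes per entry, and 5000 such entries come to 6.25 billion bytes, matching the "over six gigabytes" figure.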
> The 1000 that you have on queryResultCache is also rather large, but
> probably not a problem.  There's also documentCache, which generally is
> OK to have sized at several thousand -- I have 16384 on mine.  If your
> documents are particularly large, then you probably would want to have a
> smaller number.
>
> It's good that your autowarmCount values are low.  High values here tend
> to make commits take a very long time.
>
> You do not need to send your message more than once.  The first repeat
> was after less than 40 minutes.  The second was after about two hours.
> Waiting a day or two for a response, particularly for a difficult
> problem, is not unusual for a mailing list.  I began this reply as soon
> as I saw your message -- about 7:30 AM in my timezone.
>
> Thanks,
> Shawn
>
>
