lucene-solr-user mailing list archives

From Daniel Collins <danwcoll...@gmail.com>
Subject Re: documentCache not used in 4.3.1?
Date Tue, 02 Jul 2013 13:02:04 GMT
Cheers, it's certainly something we might end up exploring.


On 2 July 2013 12:41, Erick Erickson <erickerickson@gmail.com> wrote:

> This takes some significant custom code, but...
>
> One strategy is to keep your commits relatively
> lengthy (depends on the ingest rate) and keep
> a "side car" index either a small core or a
> RAMDirectory. Then at search time you "somehow"
> combine the two results. The "somehow" is a
> bit tricky since the scores may not be comparable.
> If you're sorting it's trivial, but what you describe
> doesn't sound like it's sorted as opposed to score.
> Or more accurately, it sounds like you're sorting
> by score.
>
> But none of that is worthwhile if you're getting
> good enough results as it stands.
>
> Best
> Erick
>
>
> On Mon, Jul 1, 2013 at 12:28 PM, Daniel Collins <danwcollins@gmail.com> wrote:
>
> > Regrettably, visibility is key for us :(  Documents must be searchable as
> > soon as they have been indexed (or as near as we can make it).  Our old
> > search system didn't do relevance sort, it was time-ordered (so it had a
> > much simpler job) but it did have sub-second latency, and that is what is
> > expected for its replacement (I know Solr doesn't like <1s currently, but
> > we live in hope!).  Tried explaining that by doing relevance sort we are
> > searching 100% of the collection, instead of the ~10%-20% a time-ordered
> > sort did (it effectively sharded by date and only searched as far back as
> > it needed to fill a page of results), but that tends to get blank looks
> > from business. :)
> >
> > One of life's little challenges.
> >
> >
> > On 1 July 2013 11:10, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > > Daniel:
> > >
> > > Soft commits invalidate the "top level" caches, which include
> > > things like filterCache, queryResultCache etc. Various
> > > "segment-level" caches are NOT invalidated, but you really
> > > don't have a lot of control from the Solr level over those
> > > anyway.
> > >
> > > But yeah, the tension between caching a bunch of stuff
> > > for query speedups and NRT is still with us. Soft commits
> > > are much less expensive than hard commits, but not being
> > > able to use the caches as much is the price. You're right
> > > that with such frequent autocommits, autowarming
> > > probably is not worth the effort.
> > >
> > > The question I always ask is whether 1 second is really
> > > necessary. Or, more accurately, worth the price. Often
> > > it's not and lengthening it out significantly may be an option,
> > > but that's a discussion for you to have with your product
> > > manager <G>....
> > >
> > > I have seen configurations that have a more frequent hard
> > > commit (openSearcher=false) than soft commit. The
> > > mantra is "soft commits are about visibility, hard commits
> > > are about durability".
> > >
> > > FWIW,
> > > Erick
> > >
> > >
> > > On Mon, Jul 1, 2013 at 3:40 AM, Daniel Collins <danwcollins@gmail.com> wrote:
> > >
> > > > We see similar results, again we softCommit every 1s (trying to get as NRT
> > > > as we can), and we very rarely get any hits in our caches.  As an
> > > > unscheduled test last week, we did shutdown indexing and noticed about 80%
> > > > hit rate in caches (and average query time dropped from ~1s to 100ms!) so I
> > > > think we are in the same position as you.
> > > >
> > > > I appreciate with such a frequent soft commit that the caches get
> > > > invalidated, but I was expecting cache warming to help though it doesn't
> > > > appear to be.  We *don't* currently run a warming query, my impression of
> > > > NRT was that it was better to not do that as otherwise you spend more time
> > > > warming the searcher and caches, and by the time you've done all that, the
> > > > searcher is invalidated anyway!
> > > >
> > > >
> > > > On 30 June 2013 01:58, Tim Vaillancourt <tim@elementspace.com> wrote:
> > > >
> > > > > That's a good idea, I'll try that next week.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Tim
> > > > >
> > > > >
> > > > > On 29/06/13 12:39 PM, Erick Erickson wrote:
> > > > >
> > > > >> Tim:
> > > > >>
> > > > >> Yeah, this doesn't make much sense to me either since,
> > > > >> as you say, you should be seeing some metrics upon
> > > > >> occasion. But do note that the underlying cache only gets
> > > > >> filled when getting documents to return in query results,
> > > > >> since there's no autowarming going on it may come and
> > > > >> go.
> > > > >>
> > > > >> But you can test this pretty quickly by lengthening your
> > > > >> autocommit interval or just not indexing anything
> > > > >> for a while, then run a bunch of queries and look at your
> > > > >> cache stats. That'll at least tell you whether it works at all.
> > > > >> You'll have to have hard commits turned off (or openSearcher
> > > > >> set to 'false') for that check too.
> > > > >>
> > > > >> Best
> > > > >> Erick
> > > > >>
> > > > >>
> > > > >> On Sat, Jun 29, 2013 at 2:48 PM, Vaillancourt, Tim <TVaillancourt@ea.com> wrote:
> > > > >>
> > > > >>> Yes, we are softCommit'ing every 1000ms, but that should be enough time to
> > > > >>> see metrics though, right? For example, I still get non-cumulative metrics
> > > > >>> from the other caches (which are also throw away). I've also curl/sampled
> > > > >>> enough that I probably should have seen a value by now.
> > > > >>>
> > > > >>> If anyone else can reproduce this on 4.3.1 I will feel less crazy :).
> > > > >>>
> > > > >>> Cheers,
> > > > >>>
> > > > >>> Tim
> > > > >>>
> > > > >>> -----Original Message-----
> > > > >>> From: Erick Erickson [mailto:erickerickson@gmail.com]
> > > > >>> Sent: Saturday, June 29, 2013 10:13 AM
> > > > >>> To: solr-user@lucene.apache.org
> > > > >>> Subject: Re: documentCache not used in 4.3.1?
> > > > >>>
> > > > >>> It's especially weird that the hit ratio is so high and you're not seeing
> > > > >>> anything in the cache. Are you perhaps soft committing frequently? Soft
> > > > >>> commits throw away all the top-level caches including documentCache I
> > > > >>> think....
> > > > >>>
> > > > >>> Erick
> > > > >>>
> > > > >>>
> > > > >>> On Fri, Jun 28, 2013 at 7:23 PM, Tim Vaillancourt <tim@elementspace.com> wrote:
> > > > >>>
> > > > >>>> Thanks Otis,
> > > > >>>>
> > > > >>>> Yeah I realized after sending my e-mail that doc cache does not warm,
> > > > >>>> however I'm still lost on why there are no other metrics.
> > > > >>>>
> > > > >>>> Thanks!
> > > > >>>>
> > > > >>>> Tim
> > > > >>>>
> > > > >>>>
> > > > >>>> On 28 June 2013 16:22, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> > > > >>>>
> > > > >>>>> Hi Tim,
> > > > >>>>>
> > > > >>>>> Not sure about the zeros in 4.3.1, but in SPM we see all these
> > > > >>>>> numbers are non-0, though I haven't had the chance to confirm with
> > > > >>>>> Solr 4.3.1.
> > > > >>>>>
> > > > >>>>> Note that you can't really autowarm document cache...
> > > > >>>>>
> > > > >>>>> Otis
> > > > >>>>> --
> > > > >>>>> Solr & ElasticSearch Support -- http://sematext.com/
> > > > >>>>> Performance Monitoring -- http://sematext.com/spm
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Jun 28, 2013 at 7:14 PM, Tim Vaillancourt
> > > > >>>>> <tim@elementspace.com>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Hey guys,
> > > > >>>>>>
> > > > >>>>>> This has to be a stupid question/I must be doing something wrong,
> > > > >>>>>> but after frequent load testing with documentCache enabled under
> > > > >>>>>> Solr 4.3.1 with autoWarmCount=150, I'm noticing that my documentCache
> > > > >>>>>> metrics are always zero for non-cumulative.
> > > > >>>>>>
> > > > >>>>>> At first I thought my commit rate is fast enough I just never see
> > > > >>>>>> the non-cumulative result, but after 100s of samples I still always
> > > > >>>>>> get zero values.
> > > > >>>>>>
> > > > >>>>>> Here is the current output of my documentCache from Solr's admin
> > > > >>>>>> for 1 core:
> > > > >>>>>>
> > > > >>>>>> "
> > > > >>>>>>
> > > > >>>>>>     - documentCache
> > > > >>>>>>       <http://localhost:8983/solr/#/channels_shard1_replica2/plugins/cache?entry=documentCache>
> > > > >>>>>>        - class:org.apache.solr.search.LRUCache
> > > > >>>>>>        - version:1.0
> > > > >>>>>>        - description:LRU Cache(maxSize=512, initialSize=512,
> > > > >>>>>>        autowarmCount=150, regenerator=null)
> > > > >>>>>>        - src:$URL:
> > > > >>>>>>        https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_3/solr/core/src/java/org/apache/solr/search/LRUCache.java
> > > > >>>>>>        $
> > > > >>>>>>        - stats:
> > > > >>>>>>           - lookups:0
> > > > >>>>>>           - hits:0
> > > > >>>>>>           - hitratio:0.00
> > > > >>>>>>           - inserts:0
> > > > >>>>>>           - evictions:0
> > > > >>>>>>           - size:0
> > > > >>>>>>           - warmupTime:0
> > > > >>>>>>           - cumulative_lookups:65198986
> > > > >>>>>>           - cumulative_hits:63075669
> > > > >>>>>>           - cumulative_hitratio:0.96
> > > > >>>>>>           - cumulative_inserts:2123317
> > > > >>>>>>           - cumulative_evictions:1010262
> > > > >>>>>>        "
> > > > >>>>>>
> > > > >>>>>> The cumulative values seem to rise, suggesting doc cache is working,
> > > > >>>>>> but at the same time it seems I never see non-cumulative metrics,
> > > > >>>>>> most importantly warmupTime.
> > > > >>>>>>
> > > > >>>>>> Am I doing something wrong, is this normal/by-design, or is there
> > > > >>>>>> an issue here?
> > > > >>>>>>
> > > > >>>>>> Thanks for helping with my silly question! Have a good weekend,
> > > > >>>>>>
> > > > >>>>>> Tim
> > > > >>>>>>
> > > > >>>>>
> > > >
> > >
> >
>
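
The "side car" strategy Erick describes has one case he calls trivial: when results are sorted by a field rather than by score, combining the main index and the side-car index is just a merge of two already-sorted lists. A minimal Python sketch of that case follows. It assumes each index returns hits as (timestamp, doc_id) tuples sorted newest first; the function name, the tuple layout, and the sample data are hypothetical, and real code would query the two cores instead of using in-memory lists. It does not attempt the harder score-sorted case, where the scores may not be comparable.

```python
import heapq

def merge_time_sorted(main_hits, sidecar_hits, rows=10):
    """Merge two newest-first hit lists into one page of results.

    Each hit is a (timestamp, doc_id) tuple. Both inputs must already be
    sorted by timestamp, descending.
    """
    merged = heapq.merge(main_hits, sidecar_hits,
                         key=lambda hit: hit[0], reverse=True)
    seen = set()
    page = []
    for ts, doc_id in merged:
        # A document may exist in both indexes; keep the first copy seen,
        # which is the one with the newest timestamp.
        if doc_id in seen:
            continue
        seen.add(doc_id)
        page.append((ts, doc_id))
        if len(page) == rows:
            break
    return page

# Hypothetical sample data: the side car holds the most recent documents.
main_hits = [(1700, "doc-7"), (1500, "doc-5"), (1200, "doc-2")]
sidecar_hits = [(1900, "doc-9"), (1600, "doc-6")]
top3 = merge_time_sorted(main_hits, sidecar_hits, rows=3)
print(top3)  # [(1900, 'doc-9'), (1700, 'doc-7'), (1600, 'doc-6')]
```

Because the merge is lazy, only as many hits as are needed to fill the page get consumed from either list.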

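The cumulative counters in Tim's admin output are internally consistent, which fits his reading that the documentCache is being used even though the per-searcher counters read zero. A quick arithmetic check in Python, using only the numbers quoted in the thread. The variable names are illustrative, and truncating to two decimals is an assumption made here to match the displayed 0.96, not a statement about how Solr computes the value.

```python
cumulative_lookups = 65198986
cumulative_hits = 63075669
cumulative_inserts = 2123317

# Every lookup is either a hit or a miss; compare the misses with the inserts.
misses = cumulative_lookups - cumulative_hits

# Integer arithmetic avoids float rounding; this truncates to hundredths.
hit_ratio_hundredths = cumulative_hits * 100 // cumulative_lookups

print(misses, misses == cumulative_inserts)  # 2123317 True
print(f"0.{hit_ratio_hundredths:02d}")       # 0.96
```

The misses equal the inserts exactly, so in this sample every missed lookup was followed by a cache insert.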