lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: documentCache not used in 4.3.1?
Date Tue, 02 Jul 2013 11:41:28 GMT
This takes some significant custom code, but...

One strategy is to keep your commit intervals relatively
long (depends on the ingest rate) and keep
a "side car" index, either a small core or a
RAMDirectory, holding the most recent documents. Then at
search time you "somehow" combine the two result sets. The
"somehow" is a bit tricky since the scores from the two
indexes may not be comparable. If you're sorting by a field
it's trivial, but what you describe sounds like it's
ordered by score rather than by a sort field.
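
To illustrate the sorted case only, here is a minimal sketch in Python. It is not taken from a real deployment: the (timestamp, doc_id) tuple layout, the function name merge_sorted_hits and the rule that the side car copy wins on duplicate ids are all assumptions made for the example. It does nothing about the score-comparability problem.

```python
import heapq
from itertools import islice


def merge_sorted_hits(main_hits, sidecar_hits, rows=10):
    """Merge two hit lists that are each already sorted newest-first.

    Each hit is a (timestamp, doc_id) tuple (hypothetical layout).
    If a doc_id appears in both lists, the side car copy wins, on the
    assumption that the side car holds the freshest version.
    """
    sidecar_ids = {doc_id for _, doc_id in sidecar_hits}
    # Drop main-index hits that the side car supersedes.
    main_filtered = (h for h in main_hits if h[1] not in sidecar_ids)
    # heapq.merge with reverse=True expects inputs sorted largest-first.
    merged = heapq.merge(main_filtered, sidecar_hits,
                         key=lambda h: h[0], reverse=True)
    return list(islice(merged, rows))


main = [(1700, "a"), (1500, "b"), (1200, "c")]
sidecar = [(1800, "d"), (1600, "b")]
page = merge_sorted_hits(main, sidecar, rows=3)
print(page)
```

Because both inputs are already ordered by the sort key, the merge only has to look at the head of each list, which is why the sorted case is the easy one.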

But none of that is worthwhile if you're getting
good enough results as it stands.

Best
Erick


On Mon, Jul 1, 2013 at 12:28 PM, Daniel Collins <danwcollins@gmail.com> wrote:

> Regrettably, visibility is key for us :(  Documents must be searchable as
> soon as they have been indexed (or as near as we can make it).  Our old
> search system didn't do relevance sort, it was time-ordered (so it had a
> much simpler job) but it did have sub-second latency, and that is what is
> expected for its replacement (I know Solr doesn't like <1s currently, but
> we live in hope!).  Tried explaining that by doing relevance sort we are
> searching 100% of the collection, instead of the ~10%-20% a time-ordered
> sort did (it effectively sharded by date and only searched as far back as
> it needed to fill a page of results), but that tends to get blank looks
> from business. :)
>
> One of life's little challenges.
>
>
> On 1 July 2013 11:10, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > Daniel:
> >
> > Soft commits invalidate the "top level" caches, which include
> > things like filterCache, queryResultCache etc. Various
> > "segment-level" caches are NOT invalidated, but you really
> > don't have a lot of control from the Solr level over those
> > anyway.
> >
> > But yeah, the tension between caching a bunch of stuff
> > for query speedups and NRT is still with us. Soft commits
> > are much less expensive than hard commits, but not being
> > able to use the caches as much is the price. You're right
> > that with such frequent autocommits, autowarming
> > probably is not worth the effort.
> >
> > The question I always ask is whether 1 second is really
> > necessary. Or, more accurately, worth the price. Often
> > it's not and lengthening it out significantly may be an option,
> > but that's a discussion for you to have with your product
> > manager <G>....
> >
> > I have seen configurations that have a more frequent hard
> > commit (openSearcher=false) than soft commit. The
> > mantra is "soft commits are about visibility, hard commits
> > are about durability".
> >
> > FWIW,
> > Erick
> >
> >
> > On Mon, Jul 1, 2013 at 3:40 AM, Daniel Collins <danwcollins@gmail.com> wrote:
> >
> > > We see similar results, again we softCommit every 1s (trying to get as
> > > NRT as we can), and we very rarely get any hits in our caches.  As an
> > > unscheduled test last week, we did shutdown indexing and noticed about
> > > 80% hit rate in caches (and average query time dropped from ~1s to
> > > 100ms!) so I think we are in the same position as you.
> > >
> > > I appreciate with such a frequent soft commit that the caches get
> > > invalidated, but I was expecting cache warming to help though it
> > > doesn't appear to be.  We *don't* currently run a warming query, my
> > > impression of NRT was that it was better to not do that as otherwise
> > > you spend more time warming the searcher and caches, and by the time
> > > you've done all that, the searcher is invalidated anyway!
> > >
> > >
> > > On 30 June 2013 01:58, Tim Vaillancourt <tim@elementspace.com> wrote:
> > >
> > > > That's a good idea, I'll try that next week.
> > > >
> > > > Thanks!
> > > >
> > > > Tim
> > > >
> > > >
> > > > On 29/06/13 12:39 PM, Erick Erickson wrote:
> > > >
> > > >> Tim:
> > > >>
> > > >> Yeah, this doesn't make much sense to me either since,
> > > >> as you say, you should be seeing some metrics upon
> > > >> occasion. But do note that the underlying cache only gets
> > > >> filled when getting documents to return in query results,
> > > >> since there's no autowarming going on it may come and
> > > >> go.
> > > >>
> > > >> But you can test this pretty quickly by lengthening your
> > > >> autocommit interval or just not indexing anything
> > > >> for a while, then run a bunch of queries and look at your
> > > >> cache stats. That'll at least tell you whether it works at all.
> > > >> You'll have to have hard commits turned off (or openSearcher
> > > >> set to 'false') for that check too.
> > > >>
> > > >> Best
> > > >> Erick
> > > >>
> > > >>
> > > > >> On Sat, Jun 29, 2013 at 2:48 PM, Vaillancourt, Tim <TVaillancourt@ea.com> wrote:
> > > > >>
> > > > >>> Yes, we are softCommit'ing every 1000ms, but that should be enough
> > > > >>> time to see metrics though, right? For example, I still get
> > > > >>> non-cumulative metrics from the other caches (which are also throw
> > > > >>> away). I've also curl/sampled enough that I probably should have
> > > > >>> seen a value by now.
> > > > >>>
> > > > >>> If anyone else can reproduce this on 4.3.1 I will feel less crazy :).
> > > >>>
> > > >>> Cheers,
> > > >>>
> > > >>> Tim
> > > >>>
> > > >>> -----Original Message-----
> > > > >>> From: Erick Erickson [mailto:erickerickson@gmail.com]
> > > >>> Sent: Saturday, June 29, 2013 10:13 AM
> > > >>> To: solr-user@lucene.apache.org
> > > >>> Subject: Re: documentCache not used in 4.3.1?
> > > >>>
> > > > >>> It's especially weird that the hit ratio is so high and you're not
> > > > >>> seeing anything in the cache. Are you perhaps soft committing
> > > > >>> frequently? Soft commits throw away all the top-level caches
> > > > >>> including documentCache I think....
> > > >>>
> > > >>> Erick
> > > >>>
> > > >>>
> > > > >>> On Fri, Jun 28, 2013 at 7:23 PM, Tim Vaillancourt <tim@elementspace.com> wrote:
> > > > >>>
> > > > >>>> Thanks Otis,
> > > > >>>>
> > > > >>>> Yeah I realized after sending my e-mail that doc cache does not
> > > > >>>> warm, however I'm still lost on why there are no other metrics.
> > > >>>>
> > > >>>> Thanks!
> > > >>>>
> > > >>>> Tim
> > > >>>>
> > > >>>>
> > > > >>>> On 28 June 2013 16:22, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> > > > >>>>
> > > > >>>>> Hi Tim,
> > > > >>>>>
> > > > >>>>> Not sure about the zeros in 4.3.1, but in SPM we see all these
> > > > >>>>> numbers are non-0, though I haven't had the chance to confirm
> > > > >>>>> with Solr 4.3.1.
> > > > >>>>>
> > > > >>>>> Note that you can't really autowarm document cache...
> > > > >>>>>
> > > > >>>>> Otis
> > > > >>>>> --
> > > > >>>>> Solr & ElasticSearch Support -- http://sematext.com/
> > > > >>>>> Performance Monitoring -- http://sematext.com/spm
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > > >>>>> On Fri, Jun 28, 2013 at 7:14 PM, Tim Vaillancourt <tim@elementspace.com> wrote:
> > > > >>>>>
> > > > >>>>>> Hey guys,
> > > > >>>>>>
> > > > >>>>>> This has to be a stupid question/I must be doing something wrong,
> > > > >>>>>> but after frequent load testing with documentCache enabled under
> > > > >>>>>> Solr 4.3.1 with autoWarmCount=150, I'm noticing that my
> > > > >>>>>> documentCache metrics are always zero for non-cumulative.
> > > > >>>>>>
> > > > >>>>>> At first I thought my commit rate is fast enough I just never see
> > > > >>>>>> the non-cumulative result, but after 100s of samples I still
> > > > >>>>>> always get zero values.
> > > > >>>>>>
> > > > >>>>>> Here is the current output of my documentCache from Solr's admin
> > > > >>>>>> for 1 core:
> > > > >>>>>>
> > > >>>>>> "
> > > >>>>>>
> > > >>>>>>     - documentCache<
> > > >>>>>>
> > > >>>>> http://localhost:8983/solr/#/**channels_shard1_replica2/**
> > > >>>> plugins/cache?en<
> > > http://localhost:8983/solr/#/channels_shard1_replica2/plugins/cache?en
> >
> > > >>>> try=documentCache
> > > >>>>
> > > >>>>>        - class:org.apache.solr.search.**LRUCache
> > > >>>>>>        - version:1.0
> > > >>>>>>        - description:LRU Cache(maxSize=512, initialSize=512,
> > > >>>>>>        autowarmCount=150, regenerator=null)
> > > >>>>>>        - src:$URL: https:/
> > > >>>>>>        /
> svn.apache.org/repos/asf/**lucene/dev/branches/lucene_**
> > > >>>>>> solr_4_3/<
> > > http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_3/>
> > > >>>>>>
>  solr/core/src/java/org/apache/**solr/search/LRUCache.java<
> > > >>>>>>
> > > >>>>> https://svn.apache.org/repos/**asf/lucene/dev/branches/**
> > > >>>> lucene_solr_4_3/s<
> > > https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_3/s
> >
> > > >>>> olr/core/src/java/org/apache/**solr/search/LRUCache.java
> > > >>>>
> > > >>>>> $
> > > >>>>>>        - stats:
> > > >>>>>>           - lookups:0
> > > >>>>>>           - hits:0
> > > >>>>>>           - hitratio:0.00
> > > >>>>>>           - inserts:0
> > > >>>>>>           - evictions:0
> > > >>>>>>           - size:0
> > > >>>>>>           - warmupTime:0
> > > >>>>>>           - cumulative_lookups:65198986
> > > >>>>>>           - cumulative_hits:63075669
> > > >>>>>>           - cumulative_hitratio:0.96
> > > >>>>>>           - cumulative_inserts:2123317
> > > >>>>>>           - cumulative_evictions:1010262
> > > >>>>>>        "
> > > >>>>>>
> > > > >>>>>> The cumulative values seem to rise, suggesting doc cache is
> > > > >>>>>> working, but at the same time it seems I never see non-cumulative
> > > > >>>>>> metrics, most importantly warmupTime.
> > > > >>>>>>
> > > > >>>>>> Am I doing something wrong, is this normal/by-design, or is there
> > > > >>>>>> an issue here?
> > > > >>>>>>
> > > > >>>>>> Thanks for helping with my silly question! Have a good weekend,
> > > >>>>>>
> > > >>>>>> Tim
> > > >>>>>>
> > > >>>>>
> > >
> >
>
