lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: old searchers not closing after optimize or replication
Date Wed, 20 Apr 2011 15:50:43 GMT
It looks OK, but still doesn't explain keeping the old files around. What does
your <deletionPolicy> in your solrconfig.xml look like? It's
possible that you're seeing Solr attempt to keep around several
optimized copies of the index, but that still doesn't explain why
restarting Solr removes them, unless the deletionPolicy gets invoked
at some point and your index files are aging out (I don't know the
internals of deletion well enough to say).
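
For reference, a stock setup usually looks something like this (these are the
defaults I'd expect, not necessarily what is in your config):

    <deletionPolicy class="solr.SolrDeletionPolicy">
      <!-- how many commit points to keep -->
      <str name="maxCommitsToKeep">1</str>
      <!-- how many optimized commit points to keep -->
      <str name="maxOptimizedCommitsToKeep">0</str>
      <!-- optionally age commit points out, e.g.
           <str name="maxCommitAge">1DAY</str> -->
    </deletionPolicy>

If maxCommitsToKeep or maxOptimizedCommitsToKeep is larger than that on your
master, keeping the older generations (12 and 18) around alongside 19 would be
expected behavior rather than a leak.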

About optimization: it's become less important with recent code. Once
upon a time, it made a substantial difference in search speed. More
recently, it has very little impact on search speed and is used
much more sparingly. Its greatest benefit is reclaiming unused resources
left over from deleted documents. So you might want to avoid the pain
of optimizing (44 minutes!) and only optimize rarely, or if you have
deleted a lot of documents.
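
If you do keep a periodic optimize, it can also be made cheaper by not merging
all the way down to a single segment; the optimize message accepts a
maxSegments attribute, something like this posted to your /update handler
(illustrative values only, adjust to your setup):

    <optimize waitFlush="false" waitSearcher="true" maxSegments="10"/>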

It might be worthwhile to try (with a smaller index!) a bunch of optimize
cycles and see if the <deletionPolicy> idea has any merit. I'd expect
your index to reach a maximum size and stay there once the number of saved
copies of the index is reached...
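
For instance (hypothetical numbers), if the config contained something like

    <str name="maxOptimizedCommitsToKeep">3</str>

I'd expect the index directory to accumulate three optimized generations and
then hold steady, dropping the oldest on each further optimize.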

But otherwise I'm puzzled...

Erick

On Wed, Apr 20, 2011 at 10:30 AM, Bernd Fehling
<bernd.fehling@uni-bielefeld.de> wrote:
> Hi Erick,
>
> On 20.04.2011 15:42, Erick Erickson wrote:
>>
>> Hmmmm, this isn't right. You've pretty much eliminated the obvious
>> things. What does lsof show? I'm assuming it shows the files are
>> being held open by your Solr instance, but it's worth checking.
>
> Just committed new content 3 times and finally optimized.
> Again the old index files are left behind.
>
> Then checked on my master: only the newest version of the index files is
> listed with lsof. No file handles to the old index files, but the
> old index files remain in data/index/.
> That's strange.
>
> This time replication worked fine and cleaned up old index on slaves.
>
>>
>> I'm not getting the same behavior, admittedly on a Windows box.
>> The only other thing I can think of is that you have a query that's
>> somehow never ending, but that's grasping at straws.
>>
>> Do your log files show anything interesting?
>
> Let's see:
> - it has the old generation (generation=12) and its files
> - and recognizes that there have been several commits (generation=18)
>
> 20.04.2011 14:05:26 org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start
> commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
> 20.04.2011 14:05:26 org.apache.solr.core.SolrDeletionPolicy onInit
> INFO: SolrDeletionPolicy.onInit: commits:num=2
>
>  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_c,version=1302159868435,generation=12,filenames=[_3xm.nrm,
>  _3xm.fdx, segments_c, _3xm.fnm, _3xm.fdt, _3xm.tis, _3xm.tii, _3xm.prx, _3xm.frq]
>
>  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_i,version=1302159868447,generation=18,filenames=[_3xm.nrm,
>  _3xo.tis, _3xp.prx, _3xo.fnm, _3xp.fdx, _3xs.frq, _3xo.tii, _3xp.fdt, _3xn.tii, _3xm.fdx,
>  _3xn.nrm, _3xm.fdt, _3xs.prx, _3xn.tis, _3xn.fdx, _3xr.nrm, _3xm.prx, _3xn.fdt, _3xp.tii, _3xs.nrm,
>  _3xp.tis, _3xo.prx, segments_i, _3xm.tii, _3xq.tii, _3xs.fdx, _3xs.fdt, _3xo.frq, _3xn.prx, _3xm.tis,
>  _3xr.prx, _3xq.tis, _3xo.fdt, _3xp.frq, _3xq.fnm, _3xo.fdx, _3xp.fnm, _3xr.tis, _3xr.fnm, _3xq.frq,
>  _3xr.tii, _3xr.frq, _3xo.nrm, _3xs.tii, _3xq.fdx, _3xq.fdt, _3xp.nrm, _3xq.prx, _3xs.tis, _3xm.frq,
>  _3xr.fdx, _3xm.fnm, _3xn.frq, _3xq.nrm, _3xs.fnm, _3xn.fnm, _3xr.fdt]
> 20.04.2011 14:05:26 org.apache.solr.core.SolrDeletionPolicy updateCommits
> INFO: newest commit = 1302159868447
>
>
> - after 44 minutes of optimizing (over 140GB and 27.8 million docs) it gets
>  the SolrDeletionPolicy onCommit and has the new generation 19 listed.
>
>
> 20.04.2011 14:49:25 org.apache.solr.core.SolrDeletionPolicy onCommit
> INFO: SolrDeletionPolicy.onCommit: commits:num=3
>
>  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_c,version=1302159868435,generation=12,filenames=[_3xm.nrm,
>  _3xm.fdx, segments_c, _3xm.fnm, _3xm.fdt, _3xm.tis, _3xm.tii, _3xm.prx, _3xm.frq]
>
>  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_i,version=1302159868447,generation=18,filenames=[_3xm.nrm,
>  _3xo.tis, _3xp.prx, _3xo.fnm, _3xp.fdx, _3xs.frq, _3xo.tii, _3xp.fdt, _3xn.tii, _3xm.fdx,
>  _3xn.nrm, _3xm.fdt, _3xs.prx, _3xn.tis, _3xn.fdx, _3xr.nrm, _3xm.prx, _3xn.fdt, _3xp.tii, _3xs.nrm,
>  _3xp.tis, _3xo.prx, segments_i, _3xm.tii, _3xq.tii, _3xs.fdx, _3xs.fdt, _3xo.frq, _3xn.prx, _3xm.tis,
>  _3xr.prx, _3xq.tis, _3xo.fdt, _3xp.frq, _3xq.fnm, _3xo.fdx, _3xp.fnm, _3xr.tis, _3xr.fnm, _3xq.frq,
>  _3xr.tii, _3xr.frq, _3xo.nrm, _3xs.tii, _3xq.fdx, _3xq.fdt, _3xp.nrm, _3xq.prx, _3xs.tis, _3xm.frq,
>  _3xr.fdx, _3xm.fnm, _3xn.frq, _3xq.nrm, _3xs.fnm, _3xn.fnm, _3xr.fdt]
>
>  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_j,version=1302159868449,generation=19,filenames=[_3xt.fnm,
>  _3xt.nrm, _3xt.frq, _3xt.fdt, _3xt.tis, _3xt.fdx, segments_j, _3xt.prx, _3xt.tii]
> 20.04.2011 14:49:25 org.apache.solr.core.SolrDeletionPolicy updateCommits
> INFO: newest commit = 1302159868449
>
>
> - it starts a new searcher and warms it up
> - it sends SolrIndexSearcher close
>
>
> 20.04.2011 14:49:29 org.apache.solr.search.SolrIndexSearcher <init>
> INFO: Opening Searcher@2c37425f main
> 20.04.2011 14:49:29 org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> 20.04.2011 14:49:29 org.apache.solr.search.SolrIndexSearcher warm
> ...
> 20.04.2011 14:49:29 org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to Searcher@2c37425f main
> 20.04.2011 14:49:29 org.apache.solr.core.SolrCore execute
> INFO: [] webapp=null path=null
> params={facet=true&start=0&event=newSearcher&q=solr&facet.limit=100&facet.field=f_dcyear&rows=10}
> hits=96 status=0 QTime=816
> 20.04.2011 14:49:30 org.apache.solr.core.SolrCore execute
> INFO: [] webapp=null path=null
> params={facet=true&start=0&event=newSearcher&q=*:*&facet.limit=100&facet.field=f_dcyear&rows=10}
> hits=27826100 status=0 QTime=633
> 20.04.2011 14:49:30 org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener done.
> 20.04.2011 14:49:30 org.apache.solr.core.SolrCore registerSearcher
> INFO: [] Registered new searcher Searcher@2c37425f main
> 20.04.2011 14:49:30 org.apache.solr.search.SolrIndexSearcher close
> INFO: Closing Searcher@10faa79a main
> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=2,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=0,cumulative_lookups=6,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=6,cumulative_evictions=0}
> documentCache{lookups=20,hits=10,hitratio=0.50,inserts=30,evictions=0,size=30,warmupTime=0,cumulative_lookups=120,cumulative_hits=60,cumulative_hitratio=0.50,cumulative_inserts=60,cumulative_evictions=0}
> 20.04.2011 14:49:30 org.apache.solr.update.processor.LogUpdateProcessor
> finish
> INFO: {optimize=} 0 2644098
>
>
> Actually I can't see anything to worry about.
>
> What is your opinion?
>
>
> Best regards
> Bernd
>
>>
>> Best
>> Erick@NotMuchHelpIKnow
>>
>> On Wed, Apr 20, 2011 at 8:37 AM, Bernd Fehling
>> <bernd.fehling@uni-bielefeld.de>  wrote:
>>>
>>> Hi Erik,
>>>
>>> On 20.04.2011 13:56, Erick Erickson wrote:
>>>>
>>>> Does this persist? In other words, if you just watch it for
>>>> some time, does the disk usage go back to normal?
>>>
>>> Only after restarting the whole Solr instance does the disk usage go back to normal.
>>>
>>>>
>>>> Because it's typical that your index size will temporarily
>>>> spike after the operations you describe as new searchers
>>>> are warmed up. During that interval, both the old and new
>>>> searchers are open.
>>>
>>> Temporarily yes, but should that still be the case a couple of hours
>>> after optimize or replication?
>>>
>>>>
>>>> Look particularly at your warmup time in the Solr admin page,
>>>> that should give you an indication of how long it takes your
>>>> warmup to happen and give you a clue about when you should
>>>> expect the index sizes to drop again.
>>>
>>> We have newSearcher and firstSearcher (both with 2 simple queries) and
>>> <useColdSearcher>false</useColdSearcher>
>>> <maxWarmingSearchers>2</maxWarmingSearchers>
>>> The QTime is less than 500 (0.5 second).
>>>
>>> warmupTime=0 for all autowarming searchers
>>>
>>>>
>>>> How often do you optimize on the master and replicate on the
>>>> slave? Because you may be getting into the runaway warmup
>>>> problem where a new searcher is opened before the last one
>>>> is autowarmed and spiraling out of control.
>>>
>>> We commit new content about every hour and do an optimize once a day.
>>> So replication is also once a day, after optimize has finished and the
>>> system has settled down.
>>> No commits during optimize or replication.
>>>
>>>
>>> Any further hints?
>>>
>>>
>>>>
>>>> Hope that helps
>>>> Erick
>>>>
>>>> On Wed, Apr 20, 2011 at 2:36 AM, Bernd Fehling
>>>> <bernd.fehling@uni-bielefeld.de>    wrote:
>>>>>
>>>>> Hello list,
>>>>>
>>>>> we have the problem that old searchers often are not closing
>>>>> after optimize (on the master) or replication (on the slaves), and we
>>>>> therefore end up with huge index volumes.
>>>>> The only solution so far is to stop and start Solr, which cleans
>>>>> everything up successfully, but this can only be a workaround.
>>>>>
>>>>> Is the parameter "waitSearcher=false" an option to solve this?
>>>>>
>>>>> Any hints what to check or to debug?
>>>>>
>>>>> We use Apache Solr 3.1.0 on Linux.
>>>>>
>>>>> Regards
>>>>> Bernd
>>>>>
>>>
>
> --
> *************************************************************
> Bernd Fehling                Universitätsbibliothek Bielefeld
> Dipl.-Inform. (FH)                        Universitätsstr. 25
> Tel. +49 521 106-4060                   Fax. +49 521 106-4052
> bernd.fehling@uni-bielefeld.de                33615 Bielefeld
>
> BASE - Bielefeld Academic Search Engine - www.base-search.net
> *************************************************************
>
