lucene-solr-user mailing list archives

From Tomás Fernández Löbbe <tomasflo...@gmail.com>
Subject Re: 7.3 appears to leak
Date Wed, 05 Sep 2018 16:36:11 GMT
I created SOLR-12743 to track this.

On Mon, Jul 16, 2018 at 12:30 PM Markus Jelsma <markus.jelsma@openindex.io>
wrote:

> Hello Thomas,
>
> To be absolutely sure you suffer from the same problem as one of our
> collections, can you confirm that your Solr cores are leaking a
> SolrIndexSearcher instance on each commit? If not, there may be a second
> problem.
>
> Also, do you run any custom plugins or apply patches to your Solr
> instances? Or is your Solr a 100 % official build?
>
> Thanks,
> Markus
>
>
>
> -----Original message-----
> > From:Thomas Scheffler <thomas.scheffler@uni-jena.de>
> > Sent: Monday 16th July 2018 13:39
> > To: solr-user@lucene.apache.org
> > Subject: Re: 7.3 appears to leak
> >
> > Hi,
> >
> > We noticed the same problems here in a rather small setup: 40,000
> > metadata documents with nearly as many files that have „literal.*“
> > fields with them. While 7.2.1 brought some Tika issues, the real
> > problems started to appear with version 7.3.0 and are still unresolved
> > in 7.4.0. Memory consumption is through the roof: where a 512 MB heap
> > was previously enough, 6 GB is now not enough to index all files.
> >
> > kind regards,
> >
> > Thomas
> >
> > > On 04.07.2018 at 15:03, Markus Jelsma <markus.jelsma@openindex.io> wrote:
> > >
> > > Hello Andrey,
> > >
> > > > I didn't think of that! I will try it when I have the courage again,
> > > > probably next week or so.
> > >
> > > Many thanks,
> > > Markus
> > >
> > >
> > > -----Original message-----
> > >> From:Kydryavtsev Andrey <werder06@yandex.ru>
> > >> Sent: Wednesday 4th July 2018 14:48
> > >> To: solr-user@lucene.apache.org
> > >> Subject: Re: 7.3 appears to leak
> > >>
> > > >> If it is not possible to find the resource leak by code analysis and
> > > >> there are no better ideas, I can suggest a brute-force approach:
> > > >> - Clone Solr's sources from the appropriate branch:
> > > >>   https://github.com/apache/lucene-solr/tree/branch_7_3
> > > >> - Log every increment/decrement operation on the searcher's holder in
> > > >>   a way that captures every caller's name (use
> > > >>   Thread.currentThread().getStackTrace() or something similar):
> > > >>   https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
> > > >> - Build custom artifacts and upload them to prod
> > > >> - After the memory leak has happened, analyse the logs to see which
> > > >>   part of the functionality doesn't decrement the searcher's counter
> > > >>   after incrementing it. If searchers are leaked, there should be such
> > > >>   code, I guess.
> > >>
> > >> This is not something someone would like to do, but it is what it is.
> > >>
> > >>
> > >>
> > >> Thank you,
> > >>
> > >> Andrey Kudryavtsev
> > >>
> > >>
> > >> 03.07.2018, 14:26, "Markus Jelsma" <markus.jelsma@openindex.io>:
> > >>> Hello Erick,
> > >>>
> > > >>> Even the silliest ideas may help us, but unfortunately this is not
> > > >>> the case. All our Solr nodes run binaries built from the same source
> > > >>> on our central build server, with the same libraries thanks to
> > > >>> provisioning. Only schema and config differ, but the <lib/>
> > > >>> directive is the same everywhere.
> > > >>>
> > > >>> Are there any other ideas, speculations, whatever, on why only our
> > > >>> main text collection leaks a SolrIndexSearcher instance on commit in
> > > >>> 7.3.0 and every later version?
> > > >>>
> > > >>> Many thanks,
> > >>> Markus
> > >>>
> > >>> -----Original message-----
> > >>>>  From:Erick Erickson <erickerickson@gmail.com>
> > >>>>  Sent: Friday 29th June 2018 19:34
> > >>>>  To: solr-user <solr-user@lucene.apache.org>
> > >>>>  Subject: Re: 7.3 appears to leak
> > >>>>
> > > >>>>  This is truly puzzling then, I'm clueless. It's hard to imagine this
> > > >>>>  is lurking out there and nobody else notices, but you've eliminated
> > > >>>>  the custom code. And this is also very peculiar:
> > >>>>
> > >>>>  * it occurs only in our main text search collection, all other
> > >>>>  collections are unaffected;
> > >>>>  * despite what i said earlier, it is so far unreproducible outside
> > >>>>  production, even when mimicking production as good as we can;
> > >>>>
> > > >>>>  Here's a tedious idea. Restart Solr with the -v option, I _think_ that
> > > >>>>  shows you each and every jar file Solr loads. Is it "somehow" possible
> > > >>>>  that your main collection is loading some jar from somewhere that's
> > > >>>>  different than you expect? 'cause silly ideas like this are all I can
> > > >>>>  come up with.
> > >>>>
> > >>>>  Erick
> > >>>>
> > >>>>  On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
> > >>>>  <markus.jelsma@openindex.io> wrote:
> > >>>>  > Hello Erick,
> > >>>>  >
> > > >>>>  > The custom search handler doesn't interact with SolrIndexSearcher, this is really all it does:
> > > >>>>  >
> > > >>>>  >   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
> > > >>>>  >     super.handleRequestBody(req, rsp);
> > > >>>>  >
> > > >>>>  >     if (rsp.getToLog().get("hits") instanceof Integer) {
> > > >>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Integer)rsp.getToLog().get("hits")));
> > > >>>>  >     }
> > > >>>>  >     if (rsp.getToLog().get("hits") instanceof Long) {
> > > >>>>  >       rsp.addHttpHeader("X-Solr-Hits", String.valueOf((Long)rsp.getToLog().get("hits")));
> > > >>>>  >     }
> > > >>>>  >   }
> > >>>>  >
> > >>>>  > I am not sure this qualifies as one more to go.
> > >>>>  >
> > > >>>>  > Re: compiler warnings on resources, yes! These, and tests failing
> > > >>>>  > due to resource leaks, have always warned me when I forgot to release
> > > >>>>  > something or decrement a reference. But the above method (and the
> > > >>>>  > token filters, which I really can't disable) is all that is left.
> > > >>>>  >
> > > >>>>  > I am quite desperate about this problem, so although I am unwilling
> > > >>>>  > to disable stuff, I can do it if I must. But I see no reason, yet, to
> > > >>>>  > remove the search handler or the token filter stuff. I mean, how
> > > >>>>  > could those leak a SolrIndexSearcher?
> > >>>>  >
> > >>>>  > Let me know :)
> > >>>>  >
> > >>>>  > Many thanks!
> > >>>>  > Markus
> > >>>>  >
> > >>>>  > -----Original message-----
> > >>>>  >> From:Erick Erickson <erickerickson@gmail.com>
> > >>>>  >> Sent: Friday 29th June 2018 18:46
> > >>>>  >> To: solr-user <solr-user@lucene.apache.org>
> > >>>>  >> Subject: Re: 7.3 appears to leak
> > >>>>  >>
> > > >>>>  >> bq. The only custom stuff left is an extension of SearchHandler that
> > > >>>>  >> only writes numFound to the response headers.
> > > >>>>  >>
> > > >>>>  >> Well, one more to go ;). It's incredibly easy to overlook
> > > >>>>  >> innocent-seeming calls that increment the underlying reference count
> > > >>>>  >> of some objects but don't decrement them, usually through a close
> > > >>>>  >> call. Which isn't necessarily a close if the underlying reference
> > > >>>>  >> count is still > 0.
> > > >>>>  >>
> > > >>>>  >> You may infer that I've been there and done that ;). Sometimes the
> > > >>>>  >> compiler warnings about "resource leak" can help pinpoint those too.
> > >>>>  >>
> > >>>>  >> Best,
> > >>>>  >> Erick
> > >>>>  >>
> > >>>>  >> On Fri, Jun 29, 2018 at 9:16 AM, Markus Jelsma
> > >>>>  >> <markus.jelsma@openindex.io> wrote:
> > >>>>  >> > Hello Yonik,
> > >>>>  >> >
> > > >>>>  >> > I took one node of the 7.2.1 cluster out of the load balancer so
> > > >>>>  >> > it would only receive shard queries. This way I could kind of
> > > >>>>  >> > 'safely' disable our custom components one by one, while keeping
> > > >>>>  >> > functionality in place by letting the other 7.2.1 nodes continue
> > > >>>>  >> > with the full configuration.
> > > >>>>  >> >
> > > >>>>  >> > I am now at a point where literally all custom components are
> > > >>>>  >> > deleted or commented out in the config for the node running 7.4.
> > > >>>>  >> > The only custom stuff left is an extension of SearchHandler that
> > > >>>>  >> > only writes numFound to the response headers, and all the token
> > > >>>>  >> > filters in our schema.
> > > >>>>  >> >
> > > >>>>  >> > You were right, it was leaking exactly one SolrIndexSearcher
> > > >>>>  >> > instance on each commit. But, with all our stuff gone, the leak is
> > > >>>>  >> > still there! I triple-checked it! Of course, the bastard is still
> > > >>>>  >> > not reproducible locally.
> > >>>>  >> >
> > >>>>  >> > So, what is next? I have no clues left.
> > >>>>  >> >
> > >>>>  >> > Many, many thanks,
> > >>>>  >> > Markus
> > >>>>  >> >
> > >>>>  >> > -----Original message-----
> > >>>>  >> >> From:Markus Jelsma <markus.jelsma@openindex.io>
> > >>>>  >> >> Sent: Thursday 28th June 2018 23:52
> > >>>>  >> >> To: solr-user@lucene.apache.org
> > >>>>  >> >> Subject: RE: 7.3 appears to leak
> > >>>>  >> >>
> > >>>>  >> >> Hello Yonik,
> > >>>>  >> >>
> > > >>>>  >> >> If leaking a whole SolrIndexSearcher causes this problem, then
> > > >>>>  >> >> the only custom component that could be responsible, our
> > > >>>>  >> >> copy/paste-and-enhance version of the elevator component, is the
> > > >>>>  >> >> root of all problems. It is a direct copy of the 7.2 source where
> > > >>>>  >> >> only things like getAnalyzedQuery, the ElevationObj and the loop
> > > >>>>  >> >> over the map entries are changed.
> > > >>>>  >> >>
> > > >>>>  >> >> There are no changes to code related to the searcher. Other
> > > >>>>  >> >> components where we get a RefCounted searcher are used without
> > > >>>>  >> >> issues; we always decrement the reference after using it. But
> > > >>>>  >> >> those components are not in use in this collection.
> > > >>>>  >> >>
> > > >>>>  >> >> The source has changed a lot with 7.4 but we still use the old
> > > >>>>  >> >> code. I will investigate the component thoroughly, and even revert
> > > >>>>  >> >> to the old 7.2 vanilla component for a brief period in production
> > > >>>>  >> >> on one machine. It may not be a problem if I don't let our load
> > > >>>>  >> >> balancer access it directly, so it only serves shard queries.
> > >>>>  >> >>
> > >>>>  >> >> I will get back to this topic tomorrow!
> > >>>>  >> >>
> > >>>>  >> >> Many thanks,
> > >>>>  >> >> Markus
> > >>>>  >> >>
> > >>>>  >> >>
> > >>>>  >> >>
> > >>>>  >> >> -----Original message-----
> > >>>>  >> >> > From:Yonik Seeley <yseeley@gmail.com>
> > >>>>  >> >> > Sent: Thursday 28th June 2018 23:30
> > >>>>  >> >> > To: solr-user@lucene.apache.org
> > >>>>  >> >> > Subject: Re: 7.3 appears to leak
> > >>>>  >> >> >
> > > >>>>  >> >> > > * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry
> > > >>>>  >> >> > > instances are both leaked on commit;
> > > >>>>  >> >> >
> > > >>>>  >> >> > If these are actually filterCache entries being leaked, it stands
> > > >>>>  >> >> > to reason that a whole searcher is being leaked somewhere.
> > >>>>  >> >> >
> > >>>>  >> >> > -Yonik
> > >>>>  >> >> >
> > >>>>  >> >>
> > >>>>  >>
> > >>
> >
> >
> >
>
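
The brute-force approach suggested in the thread (record every increment and decrement of the searcher holder's reference count together with the caller) might look roughly like the minimal Java sketch below. TracingRefCounted and LeakDemo are hypothetical names used only for illustration. This is a standalone stand-in, not Solr's RefCounted class, and it assumes that a net balance per calling method is enough, rather than writing full stack traces to a log.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reference-counted holder; NOT Solr's RefCounted.
class TracingRefCounted<T> {
    private final T resource;
    private final AtomicInteger refcount = new AtomicInteger();
    // Net balance per calling method: +1 per incref, -1 per decref.
    private final Map<String, AtomicInteger> balanceByCaller = new ConcurrentHashMap<>();

    TracingRefCounted(T resource) {
        this.resource = resource;
    }

    T get() {
        return resource;
    }

    int getRefcount() {
        return refcount.get();
    }

    void incref() {
        record(+1);
        refcount.incrementAndGet();
    }

    void decref() {
        record(-1);
        refcount.decrementAndGet();
    }

    // Attribute the operation to the first stack frame outside this class.
    private void record(int delta) {
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            String cls = frame.getClassName();
            if (cls.equals(Thread.class.getName())
                    || cls.equals(TracingRefCounted.class.getName())) {
                continue; // skip getStackTrace() itself and our own frames
            }
            String caller = cls.substring(cls.lastIndexOf('.') + 1)
                    + "." + frame.getMethodName();
            balanceByCaller.computeIfAbsent(caller, k -> new AtomicInteger())
                    .addAndGet(delta);
            return;
        }
    }

    // Callers whose increments and decrements do not cancel out.
    Map<String, Integer> suspects() {
        Map<String, Integer> result = new TreeMap<>();
        balanceByCaller.forEach((caller, balance) -> {
            if (balance.get() != 0) {
                result.put(caller, balance.get());
            }
        });
        return result;
    }
}

class LeakDemo {
    // Correct usage: the decrement is guaranteed by try/finally.
    static void wellBehaved(TracingRefCounted<String> ref) {
        ref.incref();
        try {
            ref.get().length(); // pretend to use the searcher
        } finally {
            ref.decref();
        }
    }

    // Leaky usage: increments but never decrements.
    static void leaky(TracingRefCounted<String> ref) {
        ref.incref();
    }

    public static void main(String[] args) {
        TracingRefCounted<String> ref = new TracingRefCounted<>("searcher");
        wellBehaved(ref);
        leaky(ref);
        System.out.println("refcount=" + ref.getRefcount()
                + " suspects=" + ref.suspects());
    }
}
```

Running LeakDemo should leave a reference count of 1 and report only the leaky method as a suspect. One caveat of the per-caller balance: code that legitimately increments in one method and decrements in another shows up as a +1/-1 pair, so such pairs still have to be matched by reading the recorded callers.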
