lucene-solr-user mailing list archives

From Thomas Scheffler <thomas.scheff...@uni-jena.de>
Subject Memory Leak in 7.3 to 7.4
Date Thu, 02 Aug 2018 10:06:03 GMT
Hi,

We noticed a memory leak in a rather small setup: 40,000 metadata documents with nearly as
many files, each sent with "literal.*" fields. While 7.2.1 brought some Tika issues
(due to a beta version), the real problems started to appear with version 7.3.0 and are still
unresolved in 7.4.0. Memory consumption is through the roof. Where previously a 512 MB heap was
enough, now 6 GB is not enough to index all files.
I am now at the point where I can trace this to the libraries in solr-7.4.0/contrib/extraction/lib/.
If I replace them all with the libraries shipped with 7.2.1, the problem disappears. As most
files are PDF documents, I tried updating pdfbox to 2.0.11 and tika to 1.18, but that did not
solve the problem. Next I will try downgrading just these two libraries to 2.0.6 and 1.16
to see whether they are the source of the memory leak.
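
To narrow down which libraries are candidates, the two lib directories could be compared by jar version. The following is only a minimal sketch: the helper names jar_versions and changed_jars are hypothetical, and it assumes the jars follow the usual "name-version.jar" naming scheme.

```python
import re
from pathlib import Path

# Split e.g. "pdfbox-2.0.6.jar" into name "pdfbox" and version "2.0.6".
# Assumption: the version starts at the first "-" that is followed by a digit.
JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def jar_versions(lib_dir):
    """Map artifact name -> version for every versioned jar in lib_dir."""
    versions = {}
    for jar in Path(lib_dir).glob("*.jar"):
        match = JAR_RE.match(jar.name)
        if match:
            versions[match.group("name")] = match.group("version")
    return versions


def changed_jars(old_dir, new_dir):
    """Return {name: (old_version, new_version)} for jars that differ.

    None marks a jar that is present in only one of the two directories.
    """
    old, new = jar_versions(old_dir), jar_versions(new_dir)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

It could be called as changed_jars("solr-7.2.1/contrib/extraction/lib", "solr-7.4.0/contrib/extraction/lib"), assuming both releases are unpacked side by side. The jars it reports are the ones worth swapping back one at a time.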

In the meantime, I would like to know whether anybody else has experienced the same problem.

kind regards,

Thomas
