lucene-java-user mailing list archives

From Varun Thacker <va...@vthacker.in>
Subject Re: Lucene 6.6: "Too many open files"
Date Fri, 04 Aug 2017 16:49:57 GMT
I just ran into this issue yesterday, and a couple of times last week.

The team I was working with was running Solr 5.4.1, so the auto-throttle
merge scheduling applies. We were using the default merge policy. When
infoStream was turned on for a dummy collection on the same box, the
output for the settings looked like this:

INFO  - date time; [c:collection s:shardX r:core_nodeY
x:collection_shardX_replicaY] org.apache.solr.update.LoggingInfoStream;
[MS][qtp1595212853-1189]: initDynamicDefaults spins=true maxThreadCount=1
maxMergeCount=6
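For anyone who wants the same output: that log line came from turning on
infoStream in solrconfig.xml. A minimal fragment (only the relevant element
shown; it sits inside the collection's indexConfig section):

```xml
<indexConfig>
  <!-- Route Lucene's IndexWriter diagnostics to Solr's LoggingInfoStream -->
  <infoStream>true</infoStream>
</indexConfig>
```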

There were roughly 29k files for that replica, but when we did an
lsof -p <pid> the count was less than 1500 (we have many collections
hosted on this JVM). There are lots of .fdt files that are 56/58 bytes,
each with only a corresponding .fdx file and no other segment file
extensions.
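As a quick way to reproduce the comparison, something like the sketch below
works; the pid here uses the current shell as a stand-in, so substitute your
Solr JVM's pid:

```shell
# Stand-in pid: the current shell. For Solr, substitute the JVM's pid,
# e.g. pid=$(pgrep -f solr | head -1)
pid=$$

# Number of file descriptors the process actually holds open;
# roughly what "lsof -p $pid | wc -l" reports.
open_fds=$(ls /proc/$pid/fd | wc -l)
echo "open file descriptors: $open_fds"

# The per-process limit those descriptors count against.
ulimit -n
```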

The problem happened just as we started the bulk indexing process. The
collection is 16 shards x 2 replicas. This particular JVM was hosting 2
shards, both writing to individual spinning disks. The bulk indexer sent
roughly 43k requests to this replica in the first minute, and the average
batch size was 30 documents for the replica. So I think the map-reduce
indexer starts multiple mappers and floods the shard right at the
beginning, causing some sort of race condition?
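For reference, the spins=true / maxThreadCount=1 / maxMergeCount=6 numbers
in the infoStream output above match what ConcurrentMergeScheduler computes
for its dynamic defaults. A rough paraphrase of that heuristic follows; the
SSD branch is written from memory and is not the exact Lucene source, so
treat it as a sketch:

```java
// Sketch of ConcurrentMergeScheduler's dynamic defaults (Lucene 5.x/6.x).
// The spins=true branch matches the infoStream output above; the SSD
// branch is a from-memory approximation, not the exact source.
public class MergeDefaults {
    /** Returns {maxThreadCount, maxMergeCount}. */
    static int[] dynamicDefaults(boolean spins, int coreCount) {
        if (spins) {
            // Spinning disk detected: one merge thread, small merge backlog.
            return new int[] {1, 6};
        }
        // SSD: scale merge threads with cores, capped at 4 (approximation).
        int maxThreadCount = Math.max(1, Math.min(4, coreCount / 2));
        return new int[] {maxThreadCount, maxThreadCount + 5};
    }

    public static void main(String[] args) {
        int[] d = dynamicDefaults(true, Runtime.getRuntime().availableProcessors());
        // prints "spins=true -> maxThreadCount=1 maxMergeCount=6"
        System.out.println("spins=true -> maxThreadCount=" + d[0]
                + " maxMergeCount=" + d[1]);
    }
}
```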


On Wed, Aug 2, 2017 at 1:37 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> It's documented in the Javadocs of ConcurrentMergeScheduler. It depends on
> the number of CPUs (with some upper bound) and if the index is on SSD.
> Without SSD it uses only one thread for merging.
>
> Uwe
>
> On 2 August 2017 22:01:51 MESZ, Nawab Zada Asad Iqbal
> <khichi@gmail.com> wrote:
> >Thanks Uwe,
> >
> >That worked, actually. After running for 3 hours, I observed about 88%
> >of the indexing rate compared to 4.5.0, without any file descriptor
> >issues. It seems I can probably do some tweaking to get the same
> >throughput as before. I looked at the code and the default values for
> >ConcurrentMergeScheduler are -1 (and the Solr process intelligently
> >decides the value). Is there a way to know what default is being
> >employed? Can I start with maxThreadCount and maxMergeCount = 10?
> >
> >
> >Regards
> >Nawab
> >
> >On Tue, Aug 1, 2017 at 9:35 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> >
> >> Hi,
> >>
> >> You should reset those settings back to the defaults (remove the
> >> inner settings in the factory). 30 merge threads will eat up all your
> >> file handles. In earlier versions of Lucene, internal limitations in
> >> IndexWriter made it unlikely that you would spawn too many threads,
> >> so 30 had no effect.
> >>
> >> In Lucene 6, the number of merges and threads is automatically chosen
> >> based on your disk type (SSD detection) and CPU count. So you should
> >> definitely use the defaults first and only ever change them for good
> >> reasons (if told to by specialists).
> >>
> >> Uwe
> >>
> >> On 1 August 2017 17:25:43 MESZ, Nawab Zada Asad Iqbal
> >> <khichi@gmail.com> wrote:
> >> >Thanks Jigar,
> >> >I haven't tweaked ConcurrentMergeScheduler between 4.5.0 and 6.6.
> >> >Here is what I have:
> >> >
> >> ><mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
> >> >  <int name="maxThreadCount">30</int>
> >> >  <int name="maxMergeCount">30</int>
> >> ></mergeScheduler>
> >> >
> >> >
> >> >On Mon, Jul 31, 2017 at 8:56 PM, Jigar Shah <jigaronline@gmail.com>
> >> >wrote:
> >> >
> >> >> I faced such a problem when I used NoMergePolicy along with some
> >> >> code for manually merging segments; that code had a bug and I hit
> >> >> the same issue. Make sure you have the default (AFAIR
> >> >> ConcurrentMergeScheduler) enabled, and that it is configured with
> >> >> appropriate settings.
> >> >>
> >> >> On Jul 31, 2017 11:21 PM, "Erick Erickson" <erickerickson@gmail.com>
> >> >> wrote:
> >> >>
> >> >> > No, nothing's changed fundamentally. But you say:
> >> >> >
> >> >> > "We have some batch indexing scripts, which
> >> >> > flood the solr servers with indexing requests (while keeping
> >> >> open-searcher
> >> >> > false)"
> >> >> >
> >> >> > What is your commit interval? Regardless of whether openSearcher
> >> >> > is false or not, background merging continues apace with every
> >> >> > commit. By any chance did you change your merge policy (or not
> >> >> > copy the one from 4x to 6x)? Shot in the dark...
> >> >> >
> >> >> > Best,
> >> >> > Erick
> >> >> >
> >> >> > On Mon, Jul 31, 2017 at 7:15 PM, Nawab Zada Asad Iqbal
> >> >> > <khichi@gmail.com> wrote:
> >> >> > > Hi,
> >> >> > >
> >> >> > > I am upgrading from Solr 4.5 to Solr 6.6 and hitting this issue
> >> >> > > during a complete reindexing scenario. We have some batch
> >> >> > > indexing scripts which flood the Solr servers with indexing
> >> >> > > requests (while keeping openSearcher false) for many hours and
> >> >> > > then perform one commit. This used to work fine with 4.5, but
> >> >> > > with 6.6 I get 'Too many open files' within a couple of
> >> >> > > minutes. I have checked that the ulimit is the same between the
> >> >> > > old and new servers.
> >> >> > >
> >> >> > > Has something fundamentally changed in recent Lucene versions
> >> >> > > which keeps file descriptors around for a longer time?
> >> >> > >
> >> >> > >
> >> >> > > Here is a sample error message:
> >> >> > >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:749)
> >> >> > >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:763)
> >> >> > >     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3206)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:644)
> >> >> > >     at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:93)
> >> >> > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
> >> >> > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1894)
> >> >> > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1871)
> >> >> > >     at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:160)
> >> >> > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
> >> >> > >     at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:68)
> >> >> > >     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)
> >> >> > >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> >> >> > >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >> >> > >     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> >> >> > >     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> >> >> > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> >> >> > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> >> >> > >     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> >> >> > >     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> >> >> > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >> >> > >     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >> >> > >     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> >> >> > >     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> >> >> > >     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> >> >> > >     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> >> >> > >     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> >> >> > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >> >> > >     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> >> >> > >     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> >> >> > >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> >> >> > >     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >> >> > >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> >> >> > >     at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >> >> > >     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> >> >> > >     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> >> >> > >     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> >> >> > >     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> >> >> > >     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> >> >> > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> >> >> > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> >> >> > >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> >> >> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> >> >> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> >> >> > >     at java.lang.Thread.run(Thread.java:748)
> >> >> > > Caused by: java.nio.file.FileSystemException: /local/var/solr/shard2/filesearch/data/index/_34w5.fdx: Too many open files
> >> >> > >     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> >> >> > >     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> >> >> > >     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> >> >> > >     at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
> >> >> > >     at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
> >> >> > >     at java.nio.file.Files.newOutputStream(Files.java:216)
> >> >> > >     at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
> >> >> > >     at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
> >> >> > >     at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
> >> >> > >     at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
> >> >> > >     at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
> >> >> > >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:114)
> >> >> > >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
> >> >> > >     at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
> >> >> > >     at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:39)
> >> >> > >     at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:46)
> >> >> > >     at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:364)
> >> >> > >     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:398)
> >> >> > >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)
> >> >> > >     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
> >> >> > >     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1571)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:924)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:913)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
> >> >> > >     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
> >> >> > >     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
> >> >> > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> >> >> > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1005)
> >> >> > >     at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
> >> >> > >     at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> >> >> > >     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> >> >> > >     at org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:205)
> >> >> > >     at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:261)
> >> >> > >     at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:188)
> >> >> > >     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> >> >> > >     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> >> >> > >     ... 33 more
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > Thanks
> >> >> > > Nawab
> >> >> >
> >> >> >
> >> >> > ---------------------------------------------------------------------
> >> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >> >
> >> >> >
> >> >>
> >>
> >> --
> >> Uwe Schindler
> >> Achterdiek 19, 28357 Bremen
> >> https://www.thetaphi.de
>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>
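To restate Uwe's advice from the thread as config: on Lucene/Solr 6 the
scheduler picks its own limits, so the fix is to drop the inner <int>
settings entirely. A sketch of the corresponding solrconfig.xml fragment:

```xml
<!-- No maxThreadCount/maxMergeCount children: let the scheduler
     auto-detect sensible values from CPU count and disk type. -->
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
```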
