manifoldcf-user mailing list archives

From Kishore Kumar <kishorejan...@live.com>
Subject Fwd: ManifoldCF
Date Wed, 03 May 2017 09:32:01 GMT
Looping in the ManifoldCF mailing list.

KK

________________________________
From: Matei Claudiu <claudiu.matei@optis.be>
Sent: Wednesday, May 3, 2017 2:57:52 PM
To: kishorejangid@live.com
Cc: Quirynen Jasper
Subject: ManifoldCF

Hi Kishore Kumar,

Thanks for developing ManifoldCF.

I have a question about it. I am trying to use the Windows Share connector together with Tika.
The problem is that after I index some files, I get the following error:

agents process ran out of memory - shutting down
java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOf(Arrays.java:3308)
      at java.util.BitSet.ensureCapacity(BitSet.java:337)
      at java.util.BitSet.expandTo(BitSet.java:352)
      at java.util.BitSet.set(BitSet.java:447)
      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:267)
      at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
      at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
      at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
      at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
      at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:278)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.xpath.MatchingContentHandler.characters(MatchingContentHandler.java:85)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
      at org.ccil.cowan.tagsoup.Parser.pcdata(Parser.java:994)
      at org.ccil.cowan.tagsoup.HTMLScanner.scan(HTMLScanner.java:482)
      at org.ccil.cowan.tagsoup.Parser.parse(Parser.java:449)
      at org.apache.tika.parser.code.SourceCodeParser.parse(SourceCodeParser.java:120)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
      at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
[Thread-355] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@418c5a9c{HTTP/1.1}{0.0.0.0:8345}
[Thread-355] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@387a8303{/mcf-api-service,file:/private/var/folders/nn/w4hqd84d42j6b4g1wdpdzpwr0000gn/T/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-1139783112420177477.dir/webapp/,UNAVAILABLE}{/Users/claudiu/Optis/Dev/manifoldcf/apache-manifoldcf-2.6-src/dist/example/./../web/war/mcf-api-service.war}
[Thread-355] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@69504ae9{/mcf-authority-service,file:/private/var/folders/nn/w4hqd84d42j6b4g1wdpdzpwr0000gn/T/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-4837742264173809485.dir/webapp/,UNAVAILABLE}{/Users/claudiu/Optis/Dev/manifoldcf/apache-manifoldcf-2.6-src/dist/example/./../web/war/mcf-authority-service.war}

I have already increased the Java memory to 8 GB, but this doesn’t look like a scalable solution.
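
For reference, one way to confirm that a larger memory setting actually reached the JVM is to print the maximum heap from inside a Java process. This is only a minimal sketch: the class name HeapCheck is hypothetical, and it assumes the class is launched with the same -Xmx option that is passed to the agents process.

```java
public class HeapCheck {
    // Convert a byte count to whole mebibytes (MiB).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum amount of heap the running JVM will attempt to use, in MiB.
    // Runtime.maxMemory() reflects the effective -Xmx value.
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it as, for example, `java -Xmx8g HeapCheck` should report a value close to 8192 if the setting is picked up; the exact figure can be slightly lower depending on the JVM and garbage collector.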

I noticed that I don’t get any errors when I exclude Tika.

Do you see a solution for this?

Thank you,

Claudiu Matei
