manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Out of memory, one file bug i think
Date Tue, 24 Jul 2018 11:15:03 GMT
I've opened CONNECTORS-1516 to track the Class Not Found issue, and also
created an Apache POI bugzilla ticket, which is referenced.

Karl


On Tue, Jul 24, 2018 at 6:15 AM Karl Wright <daddywri@gmail.com> wrote:

> The "class not found" error is probably a classloader issue with Tika: the
> class is present in poi-ooxml-3.17.jar. To be fair, though, it could also
> be caused by an out-of-memory condition.
>
> You should be able to find the exception in the Simple History and work
> out from that which document it came from.  If not, look at the log just
> before the exception and see what Worker Thread 1 was doing.
>
> Karl
>
>
> On Tue, Jul 24, 2018 at 5:58 AM msaunier <msaunier@citya.com> wrote:
>
>> Hello again Karl,
>>
>>
>>
>> I got an Out of Memory Error today. I think a single document is causing
>> it. I get this WARNING before the crash:
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> WARN 2018-07-24T11:46:22,098 (Worker thread '1') - Tika: Tika exception
>> extracting: TIKA-198: Illegal IOException from
>> org.apache.tika.parser.microsoft.OfficeParser@62980adb
>>
>> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException
>> from org.apache.tika.parser.microsoft.OfficeParser@62980adb
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
>> ~[tika-core-1.17.jar:1.17]
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>> ~[tika-core-1.17.jar:1.17]
>>
>>         at
>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>> ~[tika-core-1.17.jar:1.17]
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>> ~[mcf-tika-connector.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>> [mcf-tika-connector.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>> [mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>> [mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>> [mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>> [mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>> [mcf-pull-agent.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>> [mcf-pull-agent.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939)
>> [mcf-jcifs-connector.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>> [mcf-pull-agent.jar:?]
>>
>> Caused by: java.io.IOException: java.lang.ClassNotFoundException:
>> org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
>>
>>         at
>> org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:150)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102)
>> ~[?:?]
>>
>>        at
>> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>> ~[?:?]
>>
>>         ... 12 more
>>
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
>>
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> ~[?:1.8.0_171]
>>
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> ~[?:1.8.0_171]
>>
>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>> ~[?:1.8.0_171]
>>
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> ~[?:1.8.0_171]
>>
>>         at
>> org.apache.poi.poifs.crypt.EncryptionInfo.getBuilder(EncryptionInfo.java:222)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:148)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>> ~[?:?]
>>
>>         ... 12 more
>>
>>
>>
>> I think it’s a single file, because the RAM allocation behaves strangely:
>> within one second, ManifoldCF (or Tika) allocates an extra 6 GB of RAM.
>>
>>
>>
>>
>>
>> How can I find the file?
>>
>>
>>
>> Thanks,
>>
>> Maxence
>>
>
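
The log search suggested in this thread (look at what Worker Thread 1 logged just before the exception) can be sketched in Python. This is only an illustration, not a ManifoldCF tool: the function name `context_before`, the sample log lines, and the "fetching document" messages are all hypothetical. Only the `(Worker thread '1')` tag and the `TIKA-198` marker are taken from the warning quoted above.

```python
from collections import deque


def context_before(lines, thread_tag, marker, keep=20):
    """Return up to `keep` lines logged by `thread_tag` before the
    first line that contains `marker`; empty list if never found."""
    recent = deque(maxlen=keep)
    for line in lines:
        if marker in line:
            # Stop at the exception line itself; it is not included.
            return list(recent)
        if thread_tag in line:
            recent.append(line.rstrip("\n"))
    return []


# Hypothetical log excerpt in the same layout as the quoted WARN line.
sample = [
    "INFO 2018-07-24T11:46:20,000 (Worker thread '1') - fetching document A",
    "INFO 2018-07-24T11:46:21,000 (Worker thread '2') - fetching document B",
    "WARN 2018-07-24T11:46:22,098 (Worker thread '1') - Tika: Tika exception "
    "extracting: TIKA-198: Illegal IOException",
]

for entry in context_before(sample, "(Worker thread '1')", "TIKA-198"):
    print(entry)
```

To run it against a real log, pass an open file object instead of `sample`; the log file's location depends on the installation and is not given in this thread.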

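The statement that the missing class is present in poi-ooxml-3.17.jar can be checked directly, since a jar is a zip archive. The following is a minimal sketch using Python's standard `zipfile` module; the helper name `jar_has_class` and the jar path in the usage comment are assumptions, not part of ManifoldCF or POI.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar at `jar_path` contains an entry for the
    fully qualified Java class `class_name`."""
    # A class a.b.C is stored in the archive as a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Usage (the path is an assumption; adjust it to the local installation):
# jar_has_class("poi-ooxml-3.17.jar",
#               "org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder")
```

If this returns True for the jar that is actually deployed, the class file is there, and the question becomes whether that jar is visible to the classloader that tried to load the class.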