tika-dev mailing list archives

From "Bob Paulin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1507) Under OSGi, ForkParser failes to send core parser classes like ExternalParser
Date Sun, 12 Jul 2015 14:45:05 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623835#comment-14623835 ]

Bob Paulin commented on TIKA-1507:
----------------------------------

After unzipping the tika-parser jar, I noticed that the org.apache.tika.parser.external package
is not listed in the Import-Package entry of its MANIFEST.MF.  This means that OSGi will not wire
that package to the tika-parser bundle, so the tika-parser classloader cannot load the
ExternalParser class.  That leads to the NoClassDefFoundError.
The package is exported by the tika-core project's MANIFEST.MF, so I can't think of
a reason why the maven-bundle-plugin would not pick it up as an import for tika-parser.  I've
filed a bug against the maven-bundle-plugin project to see if they have any thoughts:
https://issues.apache.org/jira/browse/FELIX-4958
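
For anyone who wants to repeat the manifest check without unzipping the jar by hand, here is a
minimal sketch using only java.util.jar.  The class name ImportPackageCheck and its helper
methods are hypothetical and are not part of Tika or the maven-bundle-plugin.  It assumes a
simple header layout: clauses separated by commas, with commas inside quoted version ranges
left alone.  It is not a full OSGi header parser.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: reports whether a bundle jar lists a package in Import-Package.
public class ImportPackageCheck {

    // Split a header value on commas that sit outside double quotes,
    // because version ranges such as "[1.6,2)" contain commas themselves.
    public static List<String> splitClauses(String header) {
        List<String> clauses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : header.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                clauses.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            clauses.add(current.toString().trim());
        }
        return clauses;
    }

    // True if some clause names the package exactly. Attributes and
    // directives (parts containing '=') never match a bare package name.
    public static boolean listsPackage(String header, String pkg) {
        if (header == null) {
            return false;
        }
        for (String clause : splitClauses(header)) {
            for (String part : clause.split(";")) {
                if (part.trim().equals(pkg)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ImportPackageCheck <bundle.jar> <package>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Manifest mf = jar.getManifest(); // null when the jar has no manifest
            String header = (mf == null)
                    ? null
                    : mf.getMainAttributes().getValue("Import-Package");
            boolean found = listsPackage(header, args[1]);
            System.out.println(args[1] + (found ? " is imported" : " is NOT imported"));
        }
    }
}
```

Run it with the path to the bundle jar and the package name, for example
org.apache.tika.parser.external, as the two arguments.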

One possible workaround is an explicit Export-Package statement in the pom of the parser project.
The plugin automatically adds exported packages to the imports, and I found that adding the
attached patch allows the proper classloading to take place.  However, the patch exports
everything under org.apache.tika.parser.*, which re-exports the classes from the tika-core
bundle under the tika-parser bundle.  This could cause other bundles that use org.apache.tika.parser.*
to import these packages from tika-parser instead of tika-core.  They are the same classes, so
it's harmless but a bit odd.

I've attached a new patch with this update to the pom.

> Under OSGi, ForkParser failes to send core parser classes like ExternalParser
> -----------------------------------------------------------------------------
>
>                 Key: TIKA-1507
>                 URL: https://issues.apache.org/jira/browse/TIKA-1507
>             Project: Tika
>          Issue Type: Bug
>          Components: packaging, parser
>    Affects Versions: 1.6, 1.7
>            Reporter: Nick Burch
>
> Under OSGi, if you try to use ForkParser with the Tesseract OCR parser, it will fail with:
> java.lang.NoClassDefFoundError: org/apache/tika/parser/external/ExternalParser
> 	at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
> 	at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:91)
> 	at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
> 	at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
> 	at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:622)
> 	at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
> 	at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
> 	at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
> Caused by: java.lang.ClassNotFoundException: Unable to find class org.apache.tika.parser.external.ExternalParser
> 	at org.apache.tika.fork.ClassLoaderProxy.findClass(ClassLoaderProxy.java:117)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
> 	... 13 more
> ExternalParser lives in the Tika Core jar, not the Tika Parsers one. This all works fine
> outside of OSGi, so it looks like something about the OSGi bundling is causing the fork parser
> to fail to send the parser-related classes from Tika Core over to the forked JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
