tika-dev mailing list archives

From "Arvind Jain (Jira)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2992) java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21
Date Tue, 03 Dec 2019 05:51:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986630#comment-16986630 ]

Arvind Jain commented on TIKA-2992:

Thanks for the reply [~nick].

I tried what you suggested. It looks like we only have ASM 7.1 on our classpath, because
tika-parsers 1.21 requires it.

I debugged a bit more, and it looks like the issue is in [https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/asm/XHTMLClassVisitor.java].
This class is initialized with the ASM API level Opcodes.ASM5.

The exception is thrown here: [https://github.com/consulo/objectweb-asm/blob/master/asm/src/main/java/org/objectweb/asm/ClassVisitor.java#L150],
so an ASM7 feature is being used somewhere, which is not unexpected since tika-parsers depends
on ASM 7.1.
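
To illustrate the suspected mechanism, here is a simplified, self-contained model of an API-level guard. It is not the ASM source: the class name, the integer constants, and the helper method are hypothetical stand-ins chosen for the sketch. Only the exception message is taken from the stack trace in this issue. The idea is that a visitor which declared the ASM5 level at construction time rejects a callback that only exists from ASM7 on, even when the ASM 7.1 jar is the one on the classpath.

```java
// Simplified model of an API-level guard. NOT the real org.objectweb.asm code;
// all names and constant values below are assumptions made for illustration.
public class AsmApiGuardSketch {
    // Hypothetical stand-ins for the API-level constants a visitor is created with.
    static final int API_ASM5 = 5;
    static final int API_ASM7 = 7;

    static class SketchVisitor {
        private final int api;

        SketchVisitor(int api) {
            this.api = api;
        }

        // A callback for a newer class-file feature: visitors that declared
        // an older API level refuse it instead of silently ignoring it.
        void visitNestMember(String nestMember) {
            if (api < API_ASM7) {
                throw new UnsupportedOperationException("This feature requires ASM7");
            }
            // A real visitor would record or forward the nest member here.
        }
    }

    // Returns "ok" if the callback is accepted, otherwise the exception message.
    static String tryVisit(int api) {
        try {
            new SketchVisitor(api).visitNestMember("Outer$Inner");
            return "ok";
        } catch (UnsupportedOperationException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryVisit(API_ASM5)); // visitor created at the older level
        System.out.println(tryVisit(API_ASM7)); // visitor created at the newer level
    }
}
```

If the model matches what happens in XHTMLClassVisitor, the relevant setting is the API level passed to the visitor's constructor rather than the ASM jar version alone.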

Does this make sense, or am I missing something?


>  java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21
> ---------------------------------------------------------------------------------
>                 Key: TIKA-2992
>                 URL: https://issues.apache.org/jira/browse/TIKA-2992
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.21
>            Reporter: Arvind Jain
>            Priority: Major
> We are using the Tika Java library to parse a bunch of documents (various formats). We are
> seeing the exception below regularly in our logs on certain documents. Any suggestions on
> how to fix it would be really useful. On initial investigation, it looks like a bug caused by
> a mismatch in ASM versions between XHTMLClassVisitor and the tika-parsers pom.
> Failed to parse the document. org.apache.tika.exception.TikaException: Failed to parse a Java class
> at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:66)
> at org.apache.tika.parser.asm.ClassParser.parse (ClassParser.java:51)
> at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
> at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse (AutoDetectParser.java:143)
> at com.askscio.beam.docbuilder.processor.parsers.GenericParser.parse (GenericParser.java:55)
> <snipped>
> Caused by: java.lang.UnsupportedOperationException: This feature requires ASM7
> at org.objectweb.asm.ClassVisitor.visitNestMember (ClassVisitor.java:236)
> at org.objectweb.asm.ClassReader.accept (ClassReader.java:660)
> at org.objectweb.asm.ClassReader.accept (ClassReader.java:400)
> at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:61)
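
The trace fails in visitNestMember. Nest members are part of the class file format from Java 11 (class file major version 55) onward, so the failing documents are plausibly .class files compiled for Java 11 or later. The sketch below reads the major version from a class file header using only the standard library, as one way to check that assumption against the inputs that fail. The class and method names are hypothetical, and a version of 55 or higher only means the file may contain nest members, not that it does.

```java
import java.nio.ByteBuffer;

// Diagnostic sketch: read the class-file major version from the first 8 bytes.
// Class and method names are hypothetical; this is not part of Tika or ASM.
public class ClassFileVersionSketch {
    // Java 11 class files use major version 55.
    static final int JAVA_11_MAJOR = 55;

    // Header layout: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Missing 0xCAFEBABE magic");
        }
        buf.getShort(); // skip minor_version
        return buf.getShort() & 0xFFFF; // major_version as an unsigned value
    }

    // True when the file is new enough that it may carry nest member information.
    static boolean mayContainNestMembers(byte[] classBytes) {
        return majorVersion(classBytes) >= JAVA_11_MAJOR;
    }

    public static void main(String[] args) {
        // Hand-built header for illustration: magic, minor 0, major 55.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
        System.out.println(majorVersion(header));
        System.out.println(mayContainNestMembers(header));
    }
}
```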

