tika-dev mailing list archives

From "Bob Paulin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1910) Tika 2.0 - Decouple Tika Parser Office Module from Other Dependencies
Date Mon, 28 Mar 2016 13:18:26 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214183#comment-15214183 ]

Bob Paulin commented on TIKA-1910:

     Does this mean that if someone doesn't include the parser package or the web package
on their classpath the code will run but silently fail to work on some content?

Yes.  The goal is that if Parser X instantiates Parser Y and Y is not on the classpath, then Y
will essentially be a NoOp parser.  I've also added the ability to pass a LoadErrorHandler in to
the proxy, so if we want to log or throw an exception, that is also possible.  But from a
compilation perspective the dependency is optional.  This also means that the dependency must
be made optional in the OSGi bundle.
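
To illustrate the idea, here is a minimal, self-contained sketch of such a proxy.  It is not
the code from the patch and does not use Tika's real interfaces: SimpleParser, NoOpParser,
UpperCaseParser and OptionalParserProxy are hypothetical names, and a plain
Consumer<Throwable> stands in for the LoadErrorHandler mentioned above.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for a parser interface (not Tika's Parser API).
interface SimpleParser {
    String parse(String content);
}

// Fallback used when the real parser cannot be loaded: does nothing.
final class NoOpParser implements SimpleParser {
    @Override
    public String parse(String content) {
        return "";
    }
}

// Hypothetical parser that plays the role of an optional dependency
// which happens to be present on the classpath.
final class UpperCaseParser implements SimpleParser {
    @Override
    public String parse(String content) {
        return content.toUpperCase();
    }
}

// Loads the target parser reflectively, so there is no compile-time
// dependency on it. If loading fails, the error is reported to the
// handler and the proxy falls back to a NoOpParser.
final class OptionalParserProxy implements SimpleParser {
    private final SimpleParser delegate;

    OptionalParserProxy(String className, Consumer<Throwable> onLoadError) {
        SimpleParser loaded;
        try {
            loaded = (SimpleParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            // The handler may log, ignore, or rethrow; if it throws,
            // construction of the proxy fails with that exception.
            onLoadError.accept(e);
            loaded = new NoOpParser();
        }
        this.delegate = loaded;
    }

    @Override
    public String parse(String content) {
        return delegate.parse(content);
    }

    boolean isNoOp() {
        return delegate instanceof NoOpParser;
    }
}
```

With a handler that ignores the error, a proxy built for a class name that is not on the
classpath returns empty output from parse(), which is the "silently does nothing" behaviour
asked about.  A handler that rethrows turns the same situation into a hard failure.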

And yes, I think we might be able to totally eliminate POI from all the modules except for
office with a little bit of work, so that's pretty exciting.  If you compare the new package
parser module's pom to the old one, you'll see how many lines are removed, and that doesn't
even include all the transitive dependencies!

> Tika 2.0 - Decouple Tika Parser Office Module from Other Dependencies
> ---------------------------------------------------------------------
>                 Key: TIKA-1910
>                 URL: https://issues.apache.org/jira/browse/TIKA-1910
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 2.0
>            Reporter: Bob Paulin
>            Assignee: Bob Paulin
> Currently the Tika Parser Office Module depends on:
> Tika Parser Web Module
> Tika Parser Package Module
> Tika Parser Text Module
> Using the proxies we can make those dependencies optional, so if they are not included
> on the classpath the code still functions but performs no operation on content that would
> be parsed by the optional dependencies.

