tika-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-340) Provide full Tika bundle
Date Wed, 02 Dec 2009 14:18:20 GMT

    [ https://issues.apache.org/jira/browse/TIKA-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784839#action_12784839 ]

Ken Krugler commented on TIKA-340:
----------------------------------

Funny, I was just looking at the size of the Hadoop job jar I generate for Bixo. It was suddenly
26MB, and pushing it up to EC2 was taking a long time.

As Jukka's blog post says, it's all about the ooxml-schemas-1.0.jar file, which is almost
14MB, plus the 2.5MB xmlbeans-2.3.0.jar that this schema jar depends on. Excluding POI would
cut about 18MB from my 26MB, which I might need to do (as an option for a smaller build).
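
One way to see which bundled libraries dominate a job jar like this is to list its largest entries. The following is only a minimal sketch using the standard java.util.zip API; the class name JarSizeReport and the method largestEntries are made up for illustration and are not part of Tika, Bixo, or Hadoop.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of Tika, Bixo, or Hadoop): reports which
// entries of a jar take up the most space inside the archive.
public class JarSizeReport {

    /** Returns up to 'limit' non-directory entries, largest compressed size first. */
    static List<ZipEntry> largestEntries(Path jar, int limit) throws IOException {
        List<ZipEntry> entries = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> all = zip.entries();
            while (all.hasMoreElements()) {
                ZipEntry entry = all.nextElement();
                if (!entry.isDirectory()) {
                    entries.add(entry);
                }
            }
        }
        // The compressed size is what each entry contributes to the jar on disk.
        entries.sort(Comparator.comparingLong(ZipEntry::getCompressedSize).reversed());
        return new ArrayList<>(entries.subList(0, Math.min(limit, entries.size())));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the jar to inspect, for example a Hadoop job jar.
        for (ZipEntry entry : largestEntries(Paths.get(args[0]), 10)) {
            System.out.printf("%8.1f MB  %s%n",
                    entry.getCompressedSize() / (1024.0 * 1024.0), entry.getName());
        }
    }
}
```

Run against a job jar that nests its dependencies as jar files, a report like this should put the schema and XMLBeans jars near the top if they are as large as described above.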

> Provide full Tika bundle
> ------------------------
>
>                 Key: TIKA-340
>                 URL: https://issues.apache.org/jira/browse/TIKA-340
>             Project: Tika
>          Issue Type: New Feature
>          Components: packaging
>    Affects Versions: 0.5
>            Reporter: Felix Meschberger
>            Assignee: Jukka Zitting
>             Fix For: 0.6
>
>         Attachments: TIKA-340-2.patch, TIKA-340.patch
>
>
> To easily deploy Tika and especially the Tika parsers, it would be convenient to have
> an almost complete bundle consisting of Tika Core, Tika Parsers as well as the most important
> parser dependencies. Any remaining dependencies not included with the bundle should be declared
> as optional imports so that bundle resolution does not fail if one or the other (or all) of the
> imports is missing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

