tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1442) Upgrade to PDFBox 1.8.8
Date Thu, 23 Oct 2014 19:20:33 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181799#comment-14181799 ]

Tim Allison commented on TIKA-1442:
-----------------------------------

If it is any consolation, the Cyrillic is totally hosed. :)

I'm hoping to get a basic file server set up (thanks to Rackspace) so that I can create hyperlinks
for the source doc and for the extracted text/metadata so that you don't have to go hunting
through the directory structure, and so that you can see what's extracted without running
the app yourself.

That is probably a few weeks off though.

> Upgrade to PDFBox 1.8.8
> -----------------------
>
>                 Key: TIKA-1442
>                 URL: https://issues.apache.org/jira/browse/TIKA-1442
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>             Fix For: 1.7
>
>         Attachments: pdfbox_1_8_6V1_8_8-SNAPSHOT.xlsx, pdfbox_1_8_6V1_8_8-SNAPSHOTb.xlsx, pdfbox_1_8_6V1_8_8-SNAPSHOTc.xlsx
>
>
> Given the regressions we identified in PDFBox 1.8.7, we should upgrade to 1.8.8 as soon
> as it is ready.  I'm tempted to call this a blocker on Tika 1.7.  Let's use this issue to
> carry on the discussion of regression testing (if any further discussion is necessary) or
> any other prep that needs to happen before 1.8.8's release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
