tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-223) PDFParser causes Problems when using encrypted PDF documents
Date Fri, 22 May 2009 21:00:45 GMT

    [ https://issues.apache.org/jira/browse/TIKA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712259#action_12712259 ]

Jukka Zitting commented on TIKA-223:

Sounds reasonable. Do you have a patch for this change?

> PDFParser causes Problems when using encrypted PDF documents
> ------------------------------------------------------------
>                 Key: TIKA-223
>                 URL: https://issues.apache.org/jira/browse/TIKA-223
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>         Environment: Java 1.5.x on MAC, WIN, LIN
>            Reporter: Joachim Zittmayr
>             Fix For: 0.4
>   Original Estimate: 2h
>  Remaining Estimate: 2h
> The PDFParser.parse() method already decrypts the document in order to read the metadata,
> and then passes it to PDF2XHTML.process(), which in turn calls the inherited getText().
> That calls writeText(), which tries to decrypt the PDDocument again; this fails because the
> document is already decrypted. The solution would be to override writeText(), without the document.isEncrypted
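
The failure mode described in the report can be illustrated with a minimal sketch. The classes below (FakeDocument, BaseStripper, NoDecryptStripper, EncryptedPdfSketch) are hypothetical stand-ins written for this illustration; they are not the real PDFBox or Tika classes, and the assumption that the "encrypted" flag stays set after decryption is taken from the report rather than verified against the library. The sketch only shows why a second decrypt attempt fails and how overriding writeText() without the encryption check would avoid it.

```java
// Hypothetical stand-in for a PDF document; NOT the real PDFBox PDDocument.
class FakeDocument {
    private final boolean encrypted;
    private boolean decrypted = false;

    FakeDocument(boolean encrypted) { this.encrypted = encrypted; }

    // Assumption from the report: the flag still reads "encrypted" after decrypting.
    boolean isEncrypted() { return encrypted; }

    void decrypt(String password) {
        if (decrypted) {
            throw new IllegalStateException("document is already decrypted");
        }
        decrypted = true;
    }

    String text() { return "hello"; }
}

// Hypothetical stand-in for the inherited text stripper.
class BaseStripper {
    String getText(FakeDocument doc) {
        StringBuilder out = new StringBuilder();
        writeText(doc, out);
        return out.toString();
    }

    void writeText(FakeDocument doc, StringBuilder out) {
        // Decrypts again if the document reports itself as encrypted.
        if (doc.isEncrypted()) {
            doc.decrypt("");
        }
        out.append(doc.text());
    }
}

// The proposed fix, sketched: override writeText() and skip the encryption check.
class NoDecryptStripper extends BaseStripper {
    @Override
    void writeText(FakeDocument doc, StringBuilder out) {
        out.append(doc.text());
    }
}

class EncryptedPdfSketch {
    // Mirrors the described parse() flow: decrypt for metadata, then extract text.
    static String parse(FakeDocument doc, BaseStripper stripper) {
        if (doc.isEncrypted()) {
            doc.decrypt("");
        }
        return stripper.getText(doc);
    }

    public static void main(String[] args) {
        try {
            parse(new FakeDocument(true), new BaseStripper());
        } catch (IllegalStateException e) {
            System.out.println("base stripper failed: " + e.getMessage());
        }
        System.out.println(parse(new FakeDocument(true), new NoDecryptStripper()));
    }
}
```

With the base stripper, the second decrypt() call throws; with the overriding stripper, the already-decrypted document is read directly.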

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
