tika-dev mailing list archives

From "Nicolas Guillaumin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-1161) Dates incorrectly extracted from PDF
Date Fri, 16 Aug 2013 07:31:49 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Guillaumin updated TIKA-1161:
-------------------------------------

    Attachment: WF_16_Youth_Coalition.pdf
    
> Dates incorrectly extracted from PDF
> ------------------------------------
>
>                 Key: TIKA-1161
>                 URL: https://issues.apache.org/jira/browse/TIKA-1161
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>         Environment: Windows 7 64bit, JDK 1.7
>            Reporter: Nicolas Guillaumin
>            Priority: Minor
>              Labels: pdf
>         Attachments: WF_16_Youth_Coalition.pdf
>
>
> Tika incorrectly extracts the date from the attached PDF as 5034-09-24T14:03:00Z, whereas the actual date in the PDF appears to be 2007-03-01 10:58:57 according to Foxit Reader.
> Interestingly, PDFBox 1.8.2 extracts the correct date (when using the PDFDebugger tool).
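
For anyone trying to reproduce the discrepancy, the following is a minimal sketch (not Tika's or PDFBox's actual code) of how a PDF date string maps to a calendar date, plus a plausibility check that would flag a value such as 5034-09-24. PDF date strings use the form D:YYYYMMDDHHmmSS with an optional time-zone suffix. The input "D:20070301105857" is an assumption built from the date Foxit Reader displays; the raw value stored in the attachment is not shown in this report. The class name, method names, and the 1990 to 2100 window are hypothetical.

```java
import java.time.LocalDateTime;

public class PdfDateSketch {

    // Parses the fixed-width core of a PDF date string (D:YYYYMMDDHHmmSS).
    // Simplification: shorter forms and the time-zone suffix that the PDF
    // format also allows are not handled here.
    static LocalDateTime parsePdfDate(String raw) {
        String s = raw.startsWith("D:") ? raw.substring(2) : raw;
        if (s.length() < 14) {
            throw new IllegalArgumentException("Expected at least YYYYMMDDHHmmSS: " + raw);
        }
        return LocalDateTime.of(
                Integer.parseInt(s.substring(0, 4)),    // year
                Integer.parseInt(s.substring(4, 6)),    // month
                Integer.parseInt(s.substring(6, 8)),    // day
                Integer.parseInt(s.substring(8, 10)),   // hour
                Integer.parseInt(s.substring(10, 12)),  // minute
                Integer.parseInt(s.substring(12, 14))); // second
    }

    // Hypothetical sanity window; the bounds are arbitrary and only meant
    // to catch obviously wrong years such as 5034.
    static boolean isPlausible(LocalDateTime date) {
        int year = date.getYear();
        return year >= 1990 && year <= 2100;
    }

    public static void main(String[] args) {
        // Assumed raw value, derived from the date Foxit Reader reports.
        LocalDateTime expected = parsePdfDate("D:20070301105857");
        // Value Tika 1.4 reports according to this issue (zone designator dropped).
        LocalDateTime reported = LocalDateTime.parse("5034-09-24T14:03:00");

        System.out.println(expected + " plausible=" + isPlausible(expected));
        System.out.println(reported + " plausible=" + isPlausible(reported));
    }
}
```

A check like isPlausible does not explain where the wrong value comes from; it only makes this kind of mismatch easy to spot when comparing the output of different extractors.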

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
