tika-dev mailing list archives

From "Nicolas Guillaumin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1161) Dates incorrectly extracted from PDF
Date Fri, 16 Aug 2013 07:31:49 GMT
Nicolas Guillaumin created TIKA-1161:
----------------------------------------

             Summary: Dates incorrectly extracted from PDF
                 Key: TIKA-1161
                 URL: https://issues.apache.org/jira/browse/TIKA-1161
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.4
         Environment: Windows 7 64bit, JDK 1.7
            Reporter: Nicolas Guillaumin
            Priority: Minor
         Attachments: WF_16_Youth_Coalition.pdf

Tika incorrectly extracts the date from the attached PDF as 5034-09-24T14:03:00Z, whereas the
actual date in the PDF appears to be 2007-03-01 10:58:57 according to Foxit Reader.

Interestingly, PDFBox 1.8.2 also extracts the correct date (when using the PDFDebugger
tool).
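
For reference, here is a minimal sketch of how a date string in the standard PDF form (D:YYYYMMDDHHmmSS plus an optional offset) can be parsed and sanity-checked. The raw string stored in the attached PDF is not shown in this report, so the sample input is a hypothetical value matching the date Foxit displays. The helper names parse_pdf_date and is_plausible are illustrative only and are not part of Tika or PDFBox. Treating a missing offset as UTC is also an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Standard PDF date form: D:YYYYMMDDHHmmSSOHH'mm'
# Every field after the year is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?"
    r"(?P<year>\d{4})"
    r"(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[Zz+\-])(?:(?P<tzh>\d{2})'?(?P<tzm>\d{2})?'?)?)?$"
)


def parse_pdf_date(raw):
    """Parse a PDF date string into a timezone-aware datetime.

    Assumption: a missing offset is treated as UTC.
    """
    match = _PDF_DATE.match(raw.strip())
    if match is None:
        raise ValueError("not a standard PDF date: %r" % raw)
    g = match.groupdict()

    offset = timedelta(0)
    if g["sign"] in ("+", "-"):
        offset = timedelta(hours=int(g["tzh"] or 0), minutes=int(g["tzm"] or 0))
        if g["sign"] == "-":
            offset = -offset

    return datetime(
        int(g["year"]),
        int(g["month"] or 1),
        int(g["day"] or 1),
        int(g["hour"] or 0),
        int(g["minute"] or 0),
        int(g["second"] or 0),
        tzinfo=timezone(offset),
    )


def is_plausible(value, now=None):
    """Flag dates more than a year in the future as suspicious."""
    now = now or datetime.now(timezone.utc)
    return value.year <= now.year + 1


if __name__ == "__main__":
    # Hypothetical raw value corresponding to the date Foxit shows.
    expected = parse_pdf_date("D:20070301105857")
    # The value Tika 1.4 reported for the attached file.
    reported = datetime(5034, 9, 24, 14, 3, tzinfo=timezone.utc)
    print(expected.isoformat(), is_plausible(expected))
    print(reported.isoformat(), is_plausible(reported))
```

A check like is_plausible would flag the year 5034 value as suspicious, but it says nothing about why the parser produced it.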

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
