tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (TIKA-193) PDFParser adds mime-type twice
Date Wed, 27 May 2009 18:47:45 GMT

     [ https://issues.apache.org/jira/browse/TIKA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-193.

       Resolution: Fixed
    Fix Version/s: 0.4
         Assignee: Jukka Zitting

Patch committed in revision 779269, thanks! Resolving as Fixed.

Re: setting the type only in AutoDetectParser: there are cases where the specific parser
classes are used directly, and even in those cases it is useful to have the content type
metadata set. Also, in some cases the specific parser implementation may have more
information than AutoDetectParser and can thus provide a more accurate content type.

> PDFParser adds mime-type twice
> ------------------------------
>                 Key: TIKA-193
>                 URL: https://issues.apache.org/jira/browse/TIKA-193
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>            Reporter: Jonathan Koren
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.4
>         Attachments: patch
> Using AutoDetectParser to call PDFParser causes the mime-type to be added twice.  It
> should be added exactly once.
> Proposed Fix:
> parser/pdf/PDFParser.java should be changed from:
> metadata.add(Metadata.CONTENT_TYPE, "application/pdf");
> to:
> metadata.set(Metadata.CONTENT_TYPE, "application/pdf");
> as per other Tika bundled parsers.
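
To illustrate why the one-word change matters, here is a minimal sketch of add versus set on a multi-valued metadata container. MetadataSketch is a hypothetical stand-in written for this example, not Tika's actual Metadata class, and the "Content-Type" key and the order of calls are assumptions: it supposes the wrapping parser has already recorded the type before delegating to the PDF parser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-valued metadata container.
// It only models the add/set distinction discussed in the issue.
class MetadataSketch {
    private final Map<String, List<String>> values = new HashMap<>();

    // add: appends one more value under the given name.
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // set: discards any existing values and stores exactly one.
    public void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Returns a copy of all values stored under the given name.
    public List<String> getValues(String name) {
        return new ArrayList<>(values.getOrDefault(name, new ArrayList<>()));
    }

    public static void main(String[] args) {
        String key = "Content-Type"; // assumed key name for this sketch

        // Assumed old behaviour: wrapper sets the type, delegate then adds it again.
        MetadataSketch before = new MetadataSketch();
        before.set(key, "application/pdf");
        before.add(key, "application/pdf");
        System.out.println("with add: " + before.getValues(key).size() + " value(s)");

        // Proposed behaviour: the delegate uses set, so one value remains.
        MetadataSketch after = new MetadataSketch();
        after.set(key, "application/pdf");
        after.set(key, "application/pdf");
        System.out.println("with set: " + after.getValues(key).size() + " value(s)");
    }
}
```

With set, the parser can also be used directly (with nothing recorded beforehand) and still leave exactly one content type behind, which matches the resolution comment above.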

