tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser
Date Mon, 27 Oct 2014 19:25:34 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185657#comment-14185657 ]

Tim Allison commented on TIKA-1445:
-----------------------------------

Agreed. The user needs to limit the parsers considered for metadata parsing, though. As mentioned
in the post above, both the GDAL and Image parsers handle "png", so we can't just search all parsers
that handle "png" and pick the first or last one.
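
To make the concern concrete, here is a minimal sketch of restricting metadata parsing to a user-supplied allowlist instead of taking the first or last parser registered for "png". It is not taken from any of the attached patches and does not use Tika's API: CandidateParser, selectMetadataParsers, and the parser names passed as strings are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are Tika classes.
final class MetadataParserSelection {

    /** Stand-in for a parser: a label plus the media types it claims to handle. */
    static final class CandidateParser {
        final String name;
        final Set<String> supportedTypes;

        CandidateParser(String name, String... supportedTypes) {
            this.name = name;
            this.supportedTypes = new HashSet<>(Arrays.asList(supportedTypes));
        }
    }

    /**
     * Returns the parsers that support mediaType AND are named in the
     * user-supplied allowlist, preserving registration order.
     * Parsers that merely support the type are not selected.
     */
    static List<CandidateParser> selectMetadataParsers(
            String mediaType,
            List<CandidateParser> registered,
            Set<String> allowedNames) {
        List<CandidateParser> selected = new ArrayList<>();
        for (CandidateParser parser : registered) {
            if (parser.supportedTypes.contains(mediaType)
                    && allowedNames.contains(parser.name)) {
                selected.add(parser);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Three parsers all claim "image/png"; the labels are illustrative only.
        List<CandidateParser> registered = Arrays.asList(
                new CandidateParser("GDALParser", "image/png", "image/tiff"),
                new CandidateParser("ImageParser", "image/png", "image/jpeg"),
                new CandidateParser("TesseractOCRParser", "image/png"));

        // The user explicitly names which parser may contribute metadata.
        Set<String> allowed = new HashSet<>(Arrays.asList("ImageParser"));

        for (CandidateParser parser :
                selectMetadataParsers("image/png", registered, allowed)) {
            System.out.println(parser.name);
        }
    }
}
```

With an explicit allowlist, the result no longer depends on the order in which parsers happen to be registered; an empty allowlist simply yields no metadata parsers.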

> Figure out how to add Image metadata extraction to Tesseract parser
> -------------------------------------------------------------------
>
>                 Key: TIKA-1445
>                 URL: https://issues.apache.org/jira/browse/TIKA-1445
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 1.8
>
>         Attachments: TIKA-1445.Mattmann.101214.patch.txt, TIKA-1445.Palsulich.102614.patch, TIKA-1445_tallison_20141027.patch.txt
>
>
> Now that Tesseract is the default image parser in Tika for many image types, consider how to add back the metadata extraction capabilities of the other image parsers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
