tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser
Date Fri, 19 Dec 2014 06:00:22 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253009#comment-14253009 ]

Hudson commented on TIKA-1445:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.6 #355 (See [https://builds.apache.org/job/tika-trunk-jdk1.6/355/])
Temporary workaround for TIKA-1445 for Tika 1.7 - always pass the image to the regular parser
to get the metadata set. Will be replaced in 1.8 with composite parsers + user selected config
with strategy (nick: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1646624)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
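
The workaround described above (run OCR, then always hand the same image to the regular parser so its metadata gets set) could be sketched roughly as follows. This is only an illustration and not the code from r1646624: the ImageHandler interface, the parse method, and the "byteCount" key are hypothetical stand-ins, not Tika APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OcrWithMetadataSketch {

    /** Hypothetical stand-in for a parser: reads image bytes, may fill metadata, returns text. */
    interface ImageHandler {
        String handle(byte[] image, Map<String, String> metadata);
    }

    /** Runs OCR first, then always passes the same bytes to the regular image parser. */
    static String parse(byte[] image, ImageHandler ocr, ImageHandler regular,
                        Map<String, String> metadata) {
        String text = ocr.handle(image, metadata);
        // Workaround step: the regular parser runs unconditionally so that
        // its metadata fields are set even though OCR produced the text.
        regular.handle(image, metadata);
        return text;
    }

    public static void main(String[] args) {
        // Fake handlers, for illustration only.
        ImageHandler fakeOcr = (img, md) -> "recognized text";
        ImageHandler fakeRegular = (img, md) -> {
            md.put("byteCount", String.valueOf(img.length));
            return "";
        };
        Map<String, String> metadata = new LinkedHashMap<>();
        String text = parse(new byte[] {1, 2, 3}, fakeOcr, fakeRegular, metadata);
        System.out.println(text + " " + metadata);
    }
}
```

The sketch hard-wires the order of the two handlers; the comment above says that 1.8 is meant to replace this with composite parsers and a user-selected strategy.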


> Figure out how to add Image metadata extraction to Tesseract parser
> -------------------------------------------------------------------
>
>                 Key: TIKA-1445
>                 URL: https://issues.apache.org/jira/browse/TIKA-1445
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 1.8
>
>         Attachments: TIKA-1445.Mattmann.101214.patch.txt, TIKA-1445.Palsulich.102614.patch,
>                      TIKA-1445_tallison_20141027.patch.txt, TIKA-1445_tallison_v2_20141027.patch,
>                      TIKA-1445_tallison_v3_20141027.patch
>
>
> Now that Tesseract is the default image parser in Tika for many image types, consider
> how to add back in the metadata extraction capabilities by the other Image parsers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
