tika-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2293) Tess4jOCRParser - A simpler Java version of TesseractOCRParser
Date Thu, 04 May 2017 15:00:08 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996866#comment-15996866 ]

ASF GitHub Bot commented on TIKA-2293:

tballison commented on issue #158: TIKA-2293 - Tess4jOCRParser - A simpler Java version of
URL: https://github.com/apache/tika/pull/158#issuecomment-299211242
   See the discussion here: https://issues.apache.org/jira/browse/TIKA-2293 .  I think there's
consensus that this doesn't buy us enough and actually adds some complexity to our current
setup.  I proposed moving this into a standalone project/parser that we can mention.

>  Tess4jOCRParser - A simpler Java version of TesseractOCRParser
> ---------------------------------------------------------------
>                 Key: TIKA-2293
>                 URL: https://issues.apache.org/jira/browse/TIKA-2293
>             Project: Tika
>          Issue Type: Improvement
>          Components: ocr
>            Reporter: Thejan Wijesinghe
>             Fix For: 1.15
> Right now, TesseractOCRParser calls tesseract and imagemagick from the command line. The
> intention of this new parser, "Tess4jOCRParser", is to use the Tess4J API in process instead
> of executing tesseract out of process via Runtime.exec.
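
For readers unfamiliar with the distinction drawn in the description, the following is a minimal sketch of the out-of-process approach, not Tika's actual TesseractOCRParser code. The class name TesseractCli and its methods are hypothetical; the sketch assumes a tesseract binary is on the PATH and relies on the CLI convention that an output base of "stdout" prints the recognized text instead of writing a file. An in-process alternative along the lines proposed in the issue would replace the child process with a Tess4J call such as Tesseract.doOCR(File), which requires the Tess4J library and is therefore not shown here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration only; not part of Apache Tika.
public class TesseractCli {

    // Builds the argument list for: tesseract <image> stdout -l <language>
    static List<String> buildCommand(String imagePath, String language) {
        return List.of("tesseract", imagePath, "stdout", "-l", language);
    }

    // Runs tesseract as a child process and returns the text it printed.
    static String ocr(String imagePath, String language)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(imagePath, language))
                // Discard stderr so a chatty child cannot block on a full pipe.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        String text = new String(
                process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: TesseractCli <image> [language]");
            return;
        }
        String language = args.length > 1 ? args[1] : "eng";
        System.out.println(ocr(args[0], language));
    }
}
```

The sketch makes the costs of the out-of-process route visible: the caller depends on an external binary, has to manage the child's streams, and has to interpret its exit code. Those are the concerns an in-process API would remove, at the price of a native library dependency.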
