tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (TIKA-1703) Can't Specify Tesseract Data Folder Distinct from Tesseract Executable Path
Date Tue, 04 Aug 2015 03:21:04 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann reassigned TIKA-1703:
---------------------------------------

    Assignee: Chris A. Mattmann

> Can't Specify Tesseract Data Folder Distinct from Tesseract Executable Path
> ---------------------------------------------------------------------------
>
>                 Key: TIKA-1703
>                 URL: https://issues.apache.org/jira/browse/TIKA-1703
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.9
>            Reporter: Christian Wolfe
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.9
>
>
> If a user specifies the path to the Tesseract executable using {{TesseractOCRConfig.setTesseractPath}},
> then Tika will assume that the Tesseract config folder (usually referred to as the 'tessdata'
> folder) is in the same location. This is usually true in a Windows environment, where everything
> is installed into a central location.
> However, this is not necessarily the case in a Linux environment. If Tesseract is built
> from source, for example, the config folder is installed in a different location
> than the Tesseract executable.
> One way to fix this would be to add a way to specify the location of the Tesseract config
> folder separately from the path to the executable.
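
The fix proposed above could be sketched roughly as follows. This is a minimal, self-contained illustration and not Tika's actual code: the class `OcrPaths`, the setter `setTessdataPath`, and the helper `environmentFor` are hypothetical names chosen for this example. It assumes the data folder is handed to Tesseract through the `TESSDATA_PREFIX` environment variable; which directory level that variable expects differs between Tesseract versions, so the example paths are placeholders only.

```java
import java.util.HashMap;
import java.util.Map;

class OcrPathsSketch {

    // Hypothetical stand-in for an OCR config object; not Tika's TesseractOCRConfig.
    static class OcrPaths {
        private String tesseractPath = "";
        private String tessdataPath = "";

        // Directory that contains the tesseract executable.
        void setTesseractPath(String path) { tesseractPath = withTrailingSlash(path); }

        // Directory that contains the Tesseract data/config files (assumed setter name).
        void setTessdataPath(String path) { tessdataPath = withTrailingSlash(path); }

        String executable() { return tesseractPath + "tesseract"; }

        // Keep the old behaviour as a fallback: if no data folder was given,
        // assume it sits next to the executable.
        String effectiveTessdataPath() {
            return tessdataPath.isEmpty() ? tesseractPath : tessdataPath;
        }

        private static String withTrailingSlash(String path) {
            if (path == null || path.isEmpty()) {
                return "";
            }
            return path.endsWith("/") ? path : path + "/";
        }
    }

    // Environment entries to add to the child process before launching tesseract.
    static Map<String, String> environmentFor(OcrPaths paths) {
        Map<String, String> env = new HashMap<>();
        String data = paths.effectiveTessdataPath();
        if (!data.isEmpty()) {
            env.put("TESSDATA_PREFIX", data);
        }
        return env;
    }

    public static void main(String[] args) {
        OcrPaths paths = new OcrPaths();
        paths.setTesseractPath("/usr/local/bin");   // example location of the executable
        paths.setTessdataPath("/usr/local/share");  // example location of the data folder

        System.out.println(paths.executable());
        System.out.println(environmentFor(paths));
    }
}
```

With a `ProcessBuilder`, the returned map would be merged into `processBuilder.environment()` before starting the process. Falling back to the executable directory keeps existing Windows-style setups working when no separate data folder is configured.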



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
