tika-dev mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Tesseract OCR always activeated parser for images
Date Tue, 07 Oct 2014 14:55:14 GMT
I'll try and combine mine and Tyler's patch for 1422 and see if it
fixes it :) Will test today.

Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA

-----Original Message-----
From: Tyler Palsulich <tpalsulich@gmail.com>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Tuesday, October 7, 2014 at 1:49 AM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: Re: Tesseract OCR always activeated parser for images

>Confirmed. This is why we ran into TIKA-1422. But Chris' patch may
>provide the backwards compatibility you're looking for. What do you think?
>On Mon, Oct 6, 2014 at 7:47 PM, Lewis John Mcgibbney <
>lewis.mcgibbney@gmail.com> wrote:
>> Hi Folks,
>> Now, once I install Tesseract, it is run for every image I pass through
>> Tika server or Tika app.
>> This is not okay as it does not give me the type of metadata I am looking for.
>> This is just a note to folks, to say that AFAIK you would need to
>> unregister the parser from [0] then rebuild from source in order to
>> maintain backwards compatibility in this regard.
>> Before I log a ticket for this, can anyone else confirm this please?
>> Thanks
>> Lewis
>> [0]
>> --
>> *Lewis*
