tika-dev mailing list archives

From Chris Mattmann <mattm...@apache.org>
Subject Re: Apache Tika
Date Wed, 03 May 2017 13:58:09 GMT
Hi Gorka,

 

See: http://wiki.apache.org/tika/TikaOCR/

 

Is that what you’re looking for? If so, you can simply enable OCR for the Tika REST server
and then point your Tika Python client at that. Does that help?
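
A minimal sketch of that setup, using only the Python standard library so the HTTP call is visible. It assumes a Tika REST server is already running on localhost:9998 with Tesseract installed; the helper names, the port, and the choice of the two X-Tika-* request headers are illustrative assumptions, not something stated in this thread.

```python
import urllib.request

# Assumption: a Tika REST server is running locally on port 9998.
TIKA_SERVER = "http://localhost:9998"


def build_tika_request(pdf_bytes, server=TIKA_SERVER, ocr_language="eng"):
    """Build (but do not send) a PUT request asking Tika for plain text,
    with inline PDF images handed to the OCR parser."""
    headers = {
        "Accept": "text/plain",
        "Content-Type": "application/pdf",
        # Assumption: these per-request headers are honoured by the server
        # version in use; OCR itself requires Tesseract on the server host.
        "X-Tika-PDFextractInlineImages": "true",
        "X-Tika-OCRLanguage": ocr_language,
    }
    return urllib.request.Request(
        server + "/tika", data=pdf_bytes, headers=headers, method="PUT"
    )


def extract_text(pdf_path, server=TIKA_SERVER):
    """Send the PDF at pdf_path to the Tika server and return the text."""
    with open(pdf_path, "rb") as f:
        request = build_tika_request(f.read(), server)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling extract_text("scan.pdf") (a hypothetical file name) would return whatever text the server extracts, including OCR output if OCR is enabled. With the tika-python client, the same server URL can be passed as the serverEndpoint argument of parser.from_file.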

 

Cheers,

Chris

From: gorka gallo <gorkagal@gmail.com>
Date: Wednesday, May 3, 2017 at 2:19 AM
To: "Mattmann, Chris A (3010)" <chris.a.mattmann@jpl.nasa.gov>
Subject: Apache Tika

 

Hi Chris, 

 

I am Gorka Gallo, a research technician from Bilbao, Spain.

 

Is there any method to extract embedded images in PDF files with Apache Tika using Python?

 

Thanks,

 

Best regards,

Gorka.

