tika-dev mailing list archives

From "Damiano (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1396) Embedded images in PDF documents
Date Fri, 15 Aug 2014 00:39:18 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097978#comment-14097978 ]

Damiano commented on TIKA-1396:
-------------------------------

Hello Tim! Where can I download Tika 1.6? I will test that version.
When you wrote "you'll need to turn that on via the config file for the PDFParser", do you
mean that I have to pass the configuration file when running the Tika JAR?

At the moment I only use the Tika JAR. Let me know.
Thanks

> Embedded images in PDF documents
> --------------------------------
>
>                 Key: TIKA-1396
>                 URL: https://issues.apache.org/jira/browse/TIKA-1396
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5
>         Environment: *OS:* 
> Ubuntu 14.04.1 LTS
> *KERNEL:*
> 3.13.0-33-generic 
> gcc version 4.8.2
> *JAVA:*
> java version "1.8.0_11"
> Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)
>            Reporter: Damiano
>            Priority: Critical
>
> Hello!
> I just found a problem with PDF documents that have embedded images.
> Doing:
> java -jar tika-app-1.5.jar --extract tika.pdf
> Tika cannot find the image.
> Is this a PDF-related problem? If I do the same operation with a DOC document,
> Tika finds the image correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
