tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1182) Out of memory exception when parsing TTF file
Date Tue, 22 Oct 2013 21:57:46 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802325#comment-13802325 ]

Nick Burch commented on TIKA-1182:
----------------------------------

Your patch would have thrown an NPE if the InputStream wasn't a TikaInputStream. In r1534816
I've added a version that only does the AWT check if we have a TikaInputStream, as that's
the only way we can be sure we can rewind the stream and then use FontBox. That should work
for most people for now, but the proper fix is to get the EOF check into PDFBox, then upgrade
our dependency and remove the extra check from Tika.
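
For illustration only, here is a rough sketch of the "only pre-check when we can rewind" idea.
It is not the code committed in r1534816. To stay self-contained it uses the JDK's
markSupported()/mark()/reset() as a stand-in for the TikaInputStream test, and the names
GuardedPrecheck, Precheck, shouldParse and PRECHECK_LIMIT are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

final class GuardedPrecheck {

    /** Hypothetical stand-in for the AWT sanity check; may read from the stream. */
    interface Precheck {
        boolean looksValid(InputStream in) throws IOException;
    }

    /** Assumed upper bound on the bytes the pre-check may read before we rewind. */
    static final int PRECHECK_LIMIT = 64 * 1024;

    /**
     * Runs the pre-check only when the stream can be rewound afterwards.
     * Returns true if the main parser (FontBox in the real case) should go ahead.
     */
    static boolean shouldParse(InputStream in, Precheck precheck) throws IOException {
        if (!in.markSupported()) {
            // Cannot rewind: skip the pre-check rather than consume bytes the parser needs.
            return true;
        }
        in.mark(PRECHECK_LIMIT);
        try {
            return precheck.looksValid(in);
        } finally {
            // Hand the main parser the stream at its original position.
            in.reset();
        }
    }
}
```

With a stream that cannot be rewound the pre-check is skipped entirely, which is the trade-off
described above: such callers keep today's behaviour until the EOF check lands upstream.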

> Out of memory exception when parsing TTF file
> ---------------------------------------------
>
>                 Key: TIKA-1182
>                 URL: https://issues.apache.org/jira/browse/TIKA-1182
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.4
>         Environment: Ubuntu
> java version "1.7.0_40"
> Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
> Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
>            Reporter: Erik Hetzner
>         Attachments: 16A4FF_8.ttf, TIKA-1182-fix1.patch, TIKA_1182.java
>
>
> When parsing attached file using tika-app-1.4.jar, CPU usage is high and it never seems to finish.
> When parsing using attached java code, I get an out of memory exception.
> Let me know what other information I can provide.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.1#6144)
