tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-416) Out-of-process text extraction
Date Tue, 18 Jan 2011 15:50:43 GMT

    [ https://issues.apache.org/jira/browse/TIKA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983233#action_12983233 ]

Chris A. Mattmann commented on TIKA-416:
----------------------------------------

Awesome job Jukka!

> Out-of-process text extraction
> ------------------------------
>
>                 Key: TIKA-416
>                 URL: https://issues.apache.org/jira/browse/TIKA-416
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.9
>
>
> There's currently no easy way to guard against JVM crashes or excessive memory or CPU
> use caused by parsing very large, broken, or intentionally malicious input documents. To
> better protect against such cases and to generally improve the manageability of resource
> consumption by Tika, it would be great if we had a way to run Tika parsers in separate JVM
> processes. This could be handled either as a separate "Tika parser daemon" or as an
> explicitly managed pool of forked JVMs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

