tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks
Date Thu, 06 Sep 2018 12:27:00 GMT
Tim Allison created TIKA-2725:

             Summary: Make tika-server robust against ooms/infinite loops/memory leaks
                 Key: TIKA-2725
                 URL: https://issues.apache.org/jira/browse/TIKA-2725
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison
            Assignee: Tim Allison

Currently, tika-server is vulnerable to OOMs, infinite loops, and memory leaks.  I see two
ways of making it robust:

1) Use the ForkParser.
2) Have tika-server spawn a child process that actually runs the server, and put a watcher
thread in the child that will kill the child on OOM, on timeout, or after x files.  The
parent process can then restart the child if it dies.
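
To make option 2) concrete, here is a minimal sketch in Java. It is not Tika code: the class WatchdogSketch, the Limits fields, and both method names are hypothetical, and how the child would notice an OOM is left open (it is passed in as a flag). shouldExit is the decision the watcher thread in the child would make on each check, and supervise is the parent loop that restarts the child whenever it dies.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of option 2); none of these names come from Tika. */
public class WatchdogSketch {

    /** Limits enforced by the watcher thread inside the child (illustrative values only). */
    static final class Limits {
        final long taskTimeoutMillis; // longest a single request may run
        final long maxFiles;          // the "x files" after which the child recycles itself

        Limits(long taskTimeoutMillis, long maxFiles) {
            this.taskTimeoutMillis = taskTimeoutMillis;
            this.maxFiles = maxFiles;
        }
    }

    /**
     * Child side: decide whether the watcher should shut the child down.
     * Kept as a pure function so the policy can be checked without killing a JVM.
     */
    static boolean shouldExit(Limits limits, long currentTaskMillis,
                              long filesProcessed, boolean oomObserved) {
        return oomObserved
                || currentTaskMillis > limits.taskTimeoutMillis
                || filesProcessed >= limits.maxFiles;
    }

    /**
     * Parent side: start the child, wait for it to exit, and start it again,
     * up to maxRestarts times. Returns how many times the child was started.
     */
    static int supervise(List<String> childCommand, int maxRestarts)
            throws IOException, InterruptedException {
        int starts = 0;
        while (starts <= maxRestarts) {
            Process child = new ProcessBuilder(childCommand).inheritIO().start();
            starts++;
            int exitCode = child.waitFor(); // blocks until the child dies
            System.err.println("child exited with code " + exitCode);
        }
        return starts;
    }
}
```

In the child, a daemon thread would call shouldExit periodically and exit the JVM when it returns true.  A real parent would probably restart without an upper bound; maxRestarts only keeps the sketch from looping forever.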

I somewhat prefer 2) so that we don't have to pass the InputStream twice.  I propose 2),
making it optional in Tika 1.x and then the default in Tika 2.x.  We could also add a
status ping from parent to child in case the child gets caught up in a stop-the-world
GC (h/t [~bleskes]).
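
The status ping matters because a stop-the-world GC pauses every application thread in the child, including any watcher thread running there, so only the parent can notice the stall.  A minimal sketch of the parent-side bookkeeping follows.  It is an assumption for illustration, not Tika code: the class PingMonitorSketch and its methods are hypothetical, and the transport for the ping itself is left out.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical parent-side record of when the child last answered a ping. */
public class PingMonitorSketch {
    private final long maxSilenceMillis;
    private final AtomicLong lastReplyMillis;

    public PingMonitorSketch(long maxSilenceMillis, long startMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastReplyMillis = new AtomicLong(startMillis);
    }

    /** Call whenever the child answers a ping. */
    public void recordReply(long nowMillis) {
        lastReplyMillis.set(nowMillis);
    }

    /** True once the child has been silent for longer than the allowed window. */
    public boolean isUnresponsive(long nowMillis) {
        return nowMillis - lastReplyMillis.get() > maxSilenceMillis;
    }
}
```

When isUnresponsive returns true, the parent could call Process.destroyForcibly() on the child and start a new one.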

Other options/recommendations?
