tika-dev mailing list archives

From "Enrico Donelli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-654) Resources not properly closed
Date Thu, 05 May 2011 07:32:03 GMT
Resources not properly closed
-----------------------------

                 Key: TIKA-654
                 URL: https://issues.apache.org/jira/browse/TIKA-654
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.0
         Environment: Tested on OSX and Linux debian
            Reporter: Enrico Donelli


We have a thread that parses more than 200k files, and we always get a "too many open files"
error from the operating system. Using lsof, I noticed that the Apache Tika temp files (created by the
TemporaryFiles class) are not really deleted by the operating system, even though the delete method
returns true.
Searching the code, I found that the problem (which does not occur with every file)
is probably in the TikaInputStream#close method. There, openContainer is set to null without being
closed. When openContainer is an instance of org.apache.poi.poifs.filesystem.NPOIFSFileSystem, the
problem disappears if I call close() on openContainer. I modified the NPOIFSFileSystem class to implement
java.io.Closeable, and changed the TikaInputStream#close method to do the following:

    if (openContainer instanceof java.io.Closeable) {
        ((java.io.Closeable) openContainer).close();
    }
    openContainer = null;

I don't know if this is the best solution, but it seems to solve the problem for me.
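
For illustration, here is a minimal, self-contained sketch of the pattern described above. It is not the actual Tika or POI code: FakeContainer and ContainerOwningStream are hypothetical stand-ins for NPOIFSFileSystem and TikaInputStream, and the try/finally is an extra precaution that is not part of the change I made.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a container (such as NPOIFSFileSystem)
// that holds an open file handle and implements Closeable.
class FakeContainer implements Closeable {
    boolean closed = false;

    @Override
    public void close() {
        closed = true; // a real container would release its file handle here
    }
}

// Hypothetical, simplified stream that owns an optional "open container".
class ContainerOwningStream extends InputStream {
    private Object openContainer;

    ContainerOwningStream(Object openContainer) {
        this.openContainer = openContainer;
    }

    @Override
    public int read() {
        return -1; // no data; only the close() behaviour matters in this sketch
    }

    @Override
    public void close() throws IOException {
        try {
            // Close the container first if it supports it ...
            if (openContainer instanceof Closeable) {
                ((Closeable) openContainer).close();
            }
        } finally {
            // ... and drop the reference even if close() throws.
            openContainer = null;
        }
    }

    boolean hasOpenContainer() {
        return openContainer != null;
    }
}
```

With this shape, a container that is not Closeable is simply dropped as before, so the change should only affect containers that opt in by implementing the interface.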

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
