tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1796) Issues with tika jar and Microsoft documents like doc.,ppt, xls etc
Date Tue, 17 Nov 2015 16:34:10 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008971#comment-15008971 ]

Nick Burch commented on TIKA-1796:

Firstly, please don't post to the dev list, ignore the response, and then open a JIRA issue
for the very same thing!

As explained in the response to your post on the dev list, Apache Tika 0.9 is very old now.
There have been lots of fixes since then, including in areas like this.

Please upgrade to the most recent version (1.11) and retry.

> Issues with tika jar and Microsoft documents like doc.,ppt, xls etc
> -------------------------------------------------------------------
>                 Key: TIKA-1796
>                 URL: https://issues.apache.org/jira/browse/TIKA-1796
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 0.9
>         Environment: UNIX server
>            Reporter: Femi
> We have had a problem with tika-app-0.9.jar when it comes to using Microsoft documents
> (we do not have issues with PDFs and images). It creates tika files which are held by our
> weblogic java process.
> For example, if one runs the command: lsof -p 27305 | grep deleted
> java      27305  oracle  330r      REG              253,1   295674         68 /tmp/apache-tika-5125182301796025972.tmp
> java      27305  oracle  334r      REG              253,1   272896         69 /tmp/apache-tika-8997882426533237375.tmp
> java      27305  oracle  335r      REG              253,1   295674         78 /tmp/apache-tika-5232377327199509251.tmp
> java      27305  oracle  336r      REG              253,1    45327         43 /tmp/apache-tika-6884061409786039638.tmp
> java      27305  oracle  339r      REG              253,1   272895         41 /tmp/apache-tika-6752501215118342524.tmp
> java      27305  oracle  340r      REG              253,1   272895         41 /tmp/apache-tika-6752501215118342524.tmp
> java      27305  oracle  341r      REG              253,1    45327         75 /tmp/apache-tika-7548218713808428132.tmp
> The above is a long list of held tika files from Microsoft docs in deleted state but
> they are still handled by the weblogic process.
> The only way we can get these tika files closed or released is by restarting the weblogic
> This cost us money as we had to stop the server to get rid of the tika files filling
> up our tmp folder.
> We have had this issue for almost 3 years now. I have been researching on the web to
> see if there are solutions out there in an upgraded tika-jar but it seems there are none.
> I was thinking it will be resolved in an upgraded jar file but it seems that is not the
> Please is there any solution to this issue?
> Regards,
> Femi Balogun,
> Application Support Engineer,
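
The lsof output quoted above is what one would expect when a temporary file is deleted while the process still has a stream open on it: the name disappears from /tmp, but the space is not released until the descriptor is closed. Whether that is the actual cause in Tika 0.9 is an assumption here, not something the report confirms. As a rough sketch of the general pattern, using only the JDK and not Tika's own API (the class TempFileCleanup and the method spoolAndCount are hypothetical names for illustration), closing the stream in a try-with-resources block before deleting the file avoids leaving such descriptors behind:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempFileCleanup {

    // Hypothetical helper: spools data to a temp file, reads it back,
    // and always closes the stream and removes the file before returning.
    static long spoolAndCount(byte[] data) throws IOException {
        Path tmp = Files.createTempFile("example-", ".tmp");
        try {
            Files.write(tmp, data);
            long count = 0;
            // try-with-resources closes the stream even if reading fails,
            // so no open descriptor outlives this block.
            try (InputStream in = Files.newInputStream(tmp)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    count += n;
                }
            }
            return count;
        } finally {
            // The stream is already closed here, so deleting the file
            // releases its disk space immediately.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(spoolAndCount(new byte[45327]));
    }
}
```

If the calling application opens streams itself and hands them to a parser, the same rule applies on the caller's side: every stream that is opened should be closed in a finally block or a try-with-resources statement.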

