tika-dev mailing list archives

From "Jana, Kumar Raja" <kj...@ptc.com>
Subject RE: Tika Issue
Date Fri, 13 Feb 2009 12:39:49 GMT
Hi Amardeep,

Tika does not yet support Office 2007 documents. .xlsx documents get
parsed as zip files, so a lot of junk/unnecessary content is thrown
in. Check out TIKA-152. A patch has already been submitted but is not
yet integrated. If you plan to patch your version with the submitted fix,
keep in mind that the Tika config and mime-types XML files need to
be updated accordingly.
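
To illustrate why the output looks like junk: an .xlsx file is a zip
package of XML parts, so a zip-aware parser walks every part rather than
the cells. Below is a minimal sketch using only java.util.zip, not Tika
itself. The class name XlsxAsZip, its helper methods, and the tiny
in-memory sample package are assumptions made up for illustration; the
sample is not a complete workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class XlsxAsZip {

    // Builds a tiny stand-in for an .xlsx package in memory (hypothetical
    // content, not a complete workbook) so the sketch runs without a file.
    public static byte[] sampleWorkbookBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            addEntry(zip, "[Content_Types].xml", "<Types/>");
            addEntry(zip, "xl/workbook.xml", "<workbook/>");
            addEntry(zip, "xl/worksheets/sheet1.xml",
                    "<worksheet><sheetData><row r=\"1\">"
                    + "<c r=\"A1\"><v>42</v></c></row></sheetData></worksheet>");
        }
        return buffer.toByteArray();
    }

    // Writes one named part with the given text body into the zip stream.
    private static void addEntry(ZipOutputStream zip, String name, String body)
            throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(body.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    // Lists the part names a zip-aware parser would encounter in the package.
    public static List<String> listEntries(byte[] xlsxBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip =
                new ZipInputStream(new ByteArrayInputStream(xlsxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        for (String name : listEntries(sampleWorkbookBytes())) {
            System.out.println(name);
        }
    }
}
```

With a real workbook you would pass the file's bytes to listEntries
instead of the sample. Reading cell by cell would still need something
that understands the sheet XML, which this sketch does not attempt.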


-----Original Message-----
From: amardeep singh khera [mailto:amardeepsinghkhera@gmail.com] 
Sent: Friday, February 13, 2009 5:29 PM
To: tika-dev@lucene.apache.org
Subject: Tika Issue

Hi Tika-dev,

I am facing a problem right now while using Tika to browse an xlsx. The
issue is that I am able to extract the content of the xlsx, but for some
reason I need to read the xlsx cell-wise. I am using AutoDetectParser for
this purpose, but I am not able to find a way to browse through the xlsx
cell-wise. Please help.

Amardeep Singh Khera
