tika-dev mailing list archives

From "David vandendriessche (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1030) Page extraction for Word,Excel Documents
Date Fri, 23 Nov 2012 14:22:58 GMT
David vandendriessche created TIKA-1030:

             Summary: Page extraction for Word,Excel Documents
                 Key: TIKA-1030
                 URL: https://issues.apache.org/jira/browse/TIKA-1030
             Project: Tika
          Issue Type: Improvement
         Environment: For use with Solr
            Reporter: David vandendriessche

I would like to extract individual pages from Word documents and Excel sheets.

Reason: I'm using Solr to search files and report which page each hit is on. For this I used
PDFBox for page extraction. Now I would like to upload other document types as well, but I
can't find any paging support for them.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
