tika-dev mailing list archives

From "Martijn van Groningen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-402) Support for Keynote and Pages documents
Date Fri, 14 May 2010 13:30:42 GMT

     [ https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Martijn van Groningen updated TIKA-402:

    Attachment: testKeynote.key

Thanks for adding! Yes, I forgot to add the Apache licence, so it's good that you did. I've updated
my patch. The following has changed:
* The test for Pages did not have any assertions, so I added assertions that match the test Pages document.
* I noticed that tables in Keynote presentations weren't parsed. I fixed that and adjusted
the Keynote test and the testKeynote.key file.

I will work on Numbers support in the coming days.

> Support for Keynote and Pages documents
> ---------------------------------------
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>         Attachments: iwork.patch, iwork.patch, iwork.patch, iwork.patch, testKeynote.key, testKeynote.key, testPages.pages
> It would be nice to have support for documents created by Apple's Keynote and Pages applications.
> Both file formats are described in http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
> I'm not sure if there already are open source parser libraries for these formats or if we'd
> need to directly process the XML content.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
