tika-dev mailing list archives

From "Martijn van Groningen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-402) Support for Keynote and Pages documents
Date Tue, 04 May 2010 21:01:08 GMT

     [ https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated TIKA-402:

    Attachment: iwork.patch

Updated the patch and refactored it a bit. I introduced an extractor for each format, the
same approach I saw in the MS Office parser. Currently only Keynote has a working extractor;
support for the Pages and Numbers formats will follow shortly.
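
To illustrate the extractor-per-format idea, here is a minimal sketch. It is not taken from
iwork.patch: the class, enum, and method names are hypothetical, and the Keynote extractor
shown simply collects all character data from the XML with the JDK's SAX parser instead of
interpreting the real Keynote schema. Formats without a registered extractor are rejected,
mirroring the current state where only Keynote works.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.EnumMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: none of these names come from iwork.patch or from Tika itself.
final class IWorkExtractorSketch {

    enum Format { KEYNOTE, PAGES, NUMBERS }

    // One implementation per document format.
    interface Extractor {
        String extractText(String xml) throws Exception;
    }

    // Assumption: collects every run of character data and ignores element names,
    // so it does not depend on any particular iWork schema.
    static final class CharacterDataExtractor implements Extractor {
        @Override
        public String extractText(String xml) throws Exception {
            StringBuilder text = new StringBuilder();
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            };
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(bytes), handler);
            return text.toString().trim();
        }
    }

    // Registry mapping each format to its extractor; only Keynote is registered.
    private static final Map<Format, Extractor> EXTRACTORS = new EnumMap<>(Format.class);
    static {
        EXTRACTORS.put(Format.KEYNOTE, new CharacterDataExtractor());
    }

    // Dispatches to the extractor for the given format, or fails if none exists yet.
    static String extract(Format format, String xml) throws Exception {
        Extractor extractor = EXTRACTORS.get(format);
        if (extractor == null) {
            throw new UnsupportedOperationException("No extractor yet for " + format);
        }
        return extractor.extractText(xml);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extract(Format.KEYNOTE, "<slide><title>Hello</title></slide>"));
    }
}
```

With this layout, adding Pages or Numbers support would only mean registering another
Extractor in the map; the dispatch code stays unchanged.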

> Support for Keynote and Pages documents
> ---------------------------------------
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>         Attachments: iwork.patch, iwork.patch, testKeynote.key
> It would be nice to have support for documents created by Apple's Keynote and Pages applications.
> Both file formats are described in http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
> I'm not sure if there already are open source parser libraries for these formats or if we'd
> need to directly process the XML content.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
