tika-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-231) Difference between Web-Site and real working code
Date Fri, 22 May 2009 23:26:45 GMT

    [ https://issues.apache.org/jira/browse/TIKA-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712306#action_12712306 ]

Uwe Schindler commented on TIKA-231:
------------------------------------

sxw and related files from OpenOffice 1.0 are supported (that is, the pre-release form of
OpenDocument that uses the older Sun-specific namespaces). The mapping is done by a SAX
filter that rewrites the outdated namespaces to the new ones.
The only problem at the moment is mime-types.conf, which detects only sxw; the other
signatures should be added soon. My idea would be to use an internal catch-all MIME type
(like the one for Office) for all OpenDocument types. When I am back home, I will prepare a patch.
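
To illustrate the namespace mapping described above, here is a minimal sketch of a SAX filter
that rewrites old namespace URIs while events pass through. It is not the Tika code: the class
name NamespaceRewriteFilter, the rootNamespace helper, the two URI prefixes, and the rule of
appending ":1.0" are all assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of a namespace-rewriting SAX filter; not Tika's actual class.
class NamespaceRewriteFilter extends XMLFilterImpl {

    // Assumed URI prefixes for the old (OpenOffice 1.0) and new (OpenDocument) namespaces.
    static final String OLD_PREFIX = "http://openoffice.org/2000/";
    static final String NEW_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:";

    NamespaceRewriteFilter(XMLReader parent) {
        super(parent);
    }

    // Maps an old-style URI to its assumed new form; other URIs pass through unchanged.
    static String mapNamespace(String uri) {
        if (uri != null && uri.startsWith(OLD_PREFIX)) {
            return NEW_PREFIX + uri.substring(OLD_PREFIX.length()) + ":1.0";
        }
        return uri;
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        super.startPrefixMapping(prefix, mapNamespace(uri));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Attributes carry namespace URIs too, so copy them with mapped URIs.
        AttributesImpl mapped = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            mapped.addAttribute(mapNamespace(atts.getURI(i)), atts.getLocalName(i),
                    atts.getQName(i), atts.getType(i), atts.getValue(i));
        }
        super.startElement(mapNamespace(uri), localName, qName, mapped);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(mapNamespace(uri), localName, qName);
    }

    // Demo helper (hypothetical): parses the XML through the filter and returns
    // the namespace URI that a downstream handler sees on the root element.
    static String rootNamespace(String xml) {
        final String[] seen = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            NamespaceRewriteFilter filter = new NamespaceRewriteFilter(reader);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes atts) {
                    if (seen[0] == null) {
                        seen[0] = uri;
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return seen[0];
    }
}
```

With a filter like this in front of it, a content handler written only for the new namespaces
would not need to know about the old ones.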

> Difference between Web-Site and real working code
> -------------------------------------------------
>
>                 Key: TIKA-231
>                 URL: https://issues.apache.org/jira/browse/TIKA-231
>             Project: Tika
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.3
>         Environment: All
>            Reporter: Karl Heinz Marbaise
>            Priority: Minor
>         Attachments: TIKA-231.patch
>
>
> On the official web site it is written that OpenOffice files will not be scanned or, to
> be more accurate, "TODO", but if I scan a tar.gz / zip archive with OpenOffice files their
> contents will be extracted. So I think the web site should be updated to document the
> correct state of the code.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

