tika-dev mailing list archives

From "Bertrand Delacretaz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika
Date Thu, 20 Sep 2007 16:27:31 GMT

    [ https://issues.apache.org/jira/browse/TIKA-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529143 ]

Bertrand Delacretaz commented on TIKA-6:

Chris suggests (on the dev list):

> ...I propose
> that we remove the file freedesktop.org.xml, and then rename the DTD file
> from freedesktop.org.dtd to mime.types.dtd....

If that DTD file was part of the freedesktop package, this would not be ok IMHO, as the whole
package is covered by the GPL license.

OTOH, if the DTD was "independently created", reusing the same data model for our mime-type
database is probably ok (do we want to confirm this?)

Sorry to be picky about that, but there were some interesting discussions about similar cases
at the ASF recently [1], and I much prefer writing code to handling stuff like
this ;-)

[1] http://mail-archives.apache.org/mod_mbox/www-legal-discuss/200708.mbox/browser

> Port Nutch (or better) MimeType detection system into Tika
> ----------------------------------------------------------
>                 Key: TIKA-6
>                 URL: https://issues.apache.org/jira/browse/TIKA-6
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 0.1-incubator
>         Environment: Improvement is indep. of environment
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 0.1-incubator
>         Attachments: TIKA-6.Mattmann.091907.patch.txt
> This patch will contribute a MimeType detection system for Tika, including a MimeType data
structure and associated content-detection facilities. This will be based on Nutch's MimeType
system as a baseline; however, I'm open to suggestions. Jerome Charron mentioned that he had
an implementation of a MimeType system based on FreeDesktop.org's system. We should look into
this as well.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
