tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika
Date Sat, 03 Nov 2007 05:43:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539817 ]

Chris A. Mattmann commented on TIKA-6:
--------------------------------------

Hi Jukka,


+1 for removing freedesktop.org.xml. I think that we should include the
tika-mimetypes.xml file (which is equivalent to the current Nutch MIME types
XML file, formatted using the freedesktop.org MIME type DTD) though, since
it originated from an Apache project. What do you think?


+1 for this: my bad. I thought I had configured Tika within the project
settings for my Eclipse, but neglected to double-check that. I will fix it
in the latest patch.



+1 for this too. I will update it in the latest patch and reattach.

Otherwise, what do folks think about the code?

Thanks!

Cheers,
  Chris





______________________________________________
Chris Mattmann, Ph.D.
Chris.Mattmann@jpl.nasa.gov
Cognizant Development Engineer
Early Detection Research Network Project

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.




> Port Nutch (or better) MimeType detection system into Tika
> ----------------------------------------------------------
>
>                 Key: TIKA-6
>                 URL: https://issues.apache.org/jira/browse/TIKA-6
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 0.1-incubator
>         Environment: Improvement is indep. of environment
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 0.1-incubator
>
>         Attachments: TIKA-6.Mattmann.091907.patch.txt, TIKA-6.Mattmann.092007.patch.txt
>
>
> This patch will contribute a MimeType detection system for Tika, including a MimeType data
> structure and associated content-detection facilities. This will be based on Nutch's MimeType
> system as a baseline; however, I'm open to suggestions. Jerome Charron mentioned that he had
> an implementation of a MimeType system based on FreeDesktop.org's system. We should look into
> this as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

