nutch-dev mailing list archives

From "Sami Siren (JIRA)" <>
Subject [jira] Commented: (NUTCH-781) Update Tika to v0.6 for the MimeType detection
Date Tue, 02 Feb 2010 09:48:19 GMT


Sami Siren commented on NUTCH-781:

the version we had was the same as the one provided by Tika 0.4, so I suppose we could safely
rely on the Tika defaults. MimeUtil currently requires tika-mimetypes.xml to be available
on the classpath, but we could modify that so that it uses the default version from
the tika jar if nothing can be found in conf. Let's put that in a separate JIRA issue if we
really want it; in the meantime I'll commit v0.6 of tika-mimetypes.xml

ok. thanks.
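
For illustration only, here is a minimal sketch of the fallback lookup discussed above: try the copy of tika-mimetypes.xml that conf puts on the classpath, and fall back to the one bundled in the tika jar if it is missing. The class MimeTypesLocator and the method firstAvailable are hypothetical names, not existing Nutch or Tika code, and the bundled resource path org/apache/tika/mime/tika-mimetypes.xml is an assumption that should be checked against the tika jar actually in use.

```java
import java.net.URL;

// Hypothetical helper, not part of Nutch or Tika: resolves the first
// classpath resource that exists from an ordered list of candidates.
final class MimeTypesLocator {

    private MimeTypesLocator() {}

    /** Returns the URL of the first candidate found on the classpath, or null if none exists. */
    static URL firstAvailable(ClassLoader loader, String... candidates) {
        for (String name : candidates) {
            URL url = loader.getResource(name);
            if (url != null) {
                return url;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        URL url = firstAvailable(loader,
                // Copy expected at the classpath root (e.g. from the conf directory).
                "tika-mimetypes.xml",
                // Assumed location of the default copy inside the tika jar.
                "org/apache/tika/mime/tika-mimetypes.xml");
        System.out.println(url != null ? "Using " + url : "No tika-mimetypes.xml found");
    }
}
```

Keeping the candidates in an ordered list means a file in conf would still override the bundled default, which matches the behaviour proposed in the comment.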

> Update Tika to v0.6 for the MimeType detection
> ----------------------------------------------
>                 Key: NUTCH-781
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>             Fix For: 1.1
> [from announcement]
> Apache Tika, a subproject of Apache Lucene, is a toolkit for detecting and
> extracting metadata and structured text content from various documents using
> existing parser libraries.
> Apache Tika 0.6 contains a number of improvements and bug fixes. Details can
> be found in the changes file:

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
