nutch-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <>
Subject [jira] Updated: (NUTCH-618) Tika error "Media type alias already exists"
Date Mon, 02 Jun 2008 04:21:45 GMT


Chris A. Mattmann updated NUTCH-618:

    Attachment: NUTCH-618.Mattmann.patch.060108.2.txt

Updated patch that includes the updates to tika-mimetypes.xml identified by Dennis Kubes.
Thanks, Dennis!

Dennis tested this in his testbed environment and it ran through cleanly. I'd like to call
for a 24-48 hour review of the patch; if there are no objections, I'll commit it.



> Tika error "Media type alias already exists"
> --------------------------------------------
>                 Key: NUTCH-618
>                 URL:
>             Project: Nutch
>          Issue Type: Bug
>          Components: mime_type_detector
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Chris A. Mattmann
>         Attachments: NUTCH-618.Mattmann.patch.060108.2.txt, NUTCH-618.Mattmann.patch.060108.txt
>          Time Spent: 2h
>  Remaining Estimate: 0h
> After the upgrade to the latest Tika jar we see a lot of errors like this:
> 2008-03-06 08:07:20,659 WARN org.apache.tika.mime.MimeTypesReader: Invalid media type alias: text/xml
> org.apache.tika.mime.MimeTypeException: Media type alias already exists: text/xml
> 	at org.apache.tika.mime.MimeTypes.addAlias(
> 	at org.apache.tika.mime.MimeType.addAlias(
> 	at org.apache.tika.mime.MimeTypesReader.readMimeType(
> 	at
> 	at
> 	at org.apache.tika.mime.MimeTypesFactory.create(
> 	at org.apache.nutch.util.MimeUtil.(
> 	at org.apache.nutch.protocol.Content.(
> 	at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(
> 	at org.apache.nutch.fetcher.Fetcher2$
> This is most likely caused by the duplicate tika-mimetypes.xml file: one copy is embedded
> inside the Tika jar, the other is found in the Nutch conf/ directory. The one inside the jar
> seems to be more recent, so I propose to simply remove the one we have in conf.
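
To check a given copy of tika-mimetypes.xml for the kind of collision reported in the log, a small standalone scan can help. The following is a minimal sketch, not Tika's own loading code. It assumes the file uses the shared-mime-info style layout (mime-type elements with a type attribute and nested alias elements). The class name AliasConflictCheck and its methods are hypothetical names introduced for illustration. It reports every type or alias name that is registered more than once.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch or Tika.
public class AliasConflictCheck {

    /** Returns one message per type or alias name that is registered more than once. */
    public static List<String> findConflicts(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        Map<String, String> owner = new HashMap<>(); // name -> type that first claimed it
        List<String> conflicts = new ArrayList<>();

        // Assumed layout: <mime-type type="..."><alias type="..."/></mime-type>
        NodeList types = doc.getElementsByTagName("mime-type");
        for (int i = 0; i < types.getLength(); i++) {
            Element type = (Element) types.item(i);
            String typeName = type.getAttribute("type");
            claim(owner, conflicts, typeName, typeName);

            NodeList aliases = type.getElementsByTagName("alias");
            for (int j = 0; j < aliases.getLength(); j++) {
                String aliasName = ((Element) aliases.item(j)).getAttribute("type");
                claim(owner, conflicts, aliasName, typeName);
            }
        }
        return conflicts;
    }

    /** Convenience wrapper for XML held in a string; rethrows failures unchecked. */
    public static List<String> findConflictsInText(String xml) {
        try {
            return findConflicts(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse mime-type XML", e);
        }
    }

    // Records the first claimant of a name and reports any later claim on it.
    private static void claim(Map<String, String> owner, List<String> conflicts,
                              String name, String claimant) {
        String previous = owner.putIfAbsent(name, claimant);
        if (previous != null) {
            conflicts.add(name + " (first claimed by " + previous
                    + ", again by " + claimant + ")");
        }
    }

    public static void main(String[] args) {
        // Made-up sample in which two types both declare text/xml as an alias.
        String sample = "<mime-info>"
                + "<mime-type type=\"application/xml\"><alias type=\"text/xml\"/></mime-type>"
                + "<mime-type type=\"application/rss+xml\"><alias type=\"text/xml\"/></mime-type>"
                + "</mime-info>";
        findConflictsInText(sample).forEach(System.out::println);
    }
}
```

To scan a real file, pass a java.io.FileInputStream for the conf/ copy to findConflicts. The scan looks at one file at a time, so it shows collisions within a single copy only; it does not show what happens when the conf/ copy and the copy inside the jar are both loaded.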

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
