tika-dev mailing list archives

From "Henning Gross (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-573) MimeType.getExtension()
Date Wed, 18 May 2011 10:11:47 GMT

    [ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035265#comment-13035265 ]

Henning Gross commented on TIKA-573:
------------------------------------

I have a use case for getting all known valid extensions instead of a single "preferred extension".
We need to validate a file in an external component. That component receives a stream as
input and has no information about the file's name. Afterwards, the consumer of the component
should be told which MIME type was detected and which extensions are known to be valid
for it.

Example: A user uploads myImage.jpeg. Tika recognizes the MIME type and returns .jpg as the extension.
The consumer then checks whether .jpg matches the actual extension .jpeg, and it does not. The method
getExtension() should be renamed to getPreferredExtension() or, better, removed, since the only
way to tell which extension is "preferred" from the XML is the order of the entries. It would be
better to implement public List<String> getExtensions().
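
To make the request concrete, here is a minimal, self-contained sketch of the check the consumer would like to perform. It does not use Tika at all: the class ExtensionCheck, its methods, and the hard-coded extension table are hypothetical and only illustrate what a getExtensions()-style lookup could enable, assuming the first entry is treated as the preferred extension.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Tika: maps a detected MIME type to all known extensions. */
final class ExtensionCheck {
    // Illustrative data only; a real registry would be read from the MIME type definitions.
    private static final Map<String, List<String>> EXTENSIONS = new HashMap<>();
    static {
        EXTENSIONS.put("image/jpeg", Arrays.asList(".jpg", ".jpeg", ".jpe"));
        EXTENSIONS.put("image/png", Arrays.asList(".png"));
    }

    /** Returns every known extension for the type; by assumption the first one is "preferred". */
    static List<String> getExtensions(String mimeType) {
        List<String> known = EXTENSIONS.get(mimeType);
        if (known == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(known);
    }

    /** True if the file name ends with any known extension of the detected type. */
    static boolean matches(String fileName, String detectedMimeType) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : getExtensions(detectedMimeType)) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }
}
```

With a list of extensions, ExtensionCheck.matches("myImage.jpeg", "image/jpeg") succeeds, whereas comparing against a single preferred extension (.jpg) would reject the same upload.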

I can provide a patch today if you would like one. Can you tell me how soon this
change would be available in a public snapshot or similar? We need this in our component
and it is something of a showstopper. We would have to move away from Tika if we cannot
get ALL known valid extensions...

> MimeType.getExtension()
> -----------------------
>
>                 Key: TIKA-573
>                 URL: https://issues.apache.org/jira/browse/TIKA-573
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>            Reporter: Maxim Valyanskiy
>             Fix For: 0.9
>
>         Attachments: 0001-TIKA-573-add-MimeType.getExtension.patch, TIKA-573.patch
>
>
> This patch adds getExtension() method to MimeType and support for reading mime-types
> from mime.types format.
> I added mime.types file from Fedora Linux, license says that it is public domain file:
> ===
> Red Hat disclaims any copyright on the "mailcap" and "mime-types" files and places them
> in the public domain. You are free to do whatever you wish with these files.
> The mailcap.4 man page is under an MIT license:
> Copyright (c) 1991 Bell Communications Research, Inc. (Bellcore)
> Permission to use, copy, modify, and distribute this material
> for any purpose and without fee is hereby granted, provided
> that the above copyright notice and this permission notice
> appear in all copies, and that the name of Bellcore not be
> used in advertising or publicity pertaining to this
> material without the specific, prior written permission
> of an authorized representative of Bellcore.  BELLCORE
> MAKES NO REPRESENTATIONS ABOUT THE ACCURACY OR SUITABILITY
> OF THIS MATERIAL FOR ANY PURPOSE.  IT IS PROVIDED "AS IS",
> WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES.
> Tom Callaway, Fedora Legal, Red Hat
> Thu Sep 17, 2009
> ===
> (we do not need man page, only mime.types file)
> getExtension() method can be used for creating friendly filename for OLE-embedded files,
> streams and other cases when name is not known

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
