tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1326) MSI file detection
Date Fri, 06 Jun 2014 13:22:02 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019834#comment-14019834 ]

Nick Burch commented on TIKA-1326:
----------------------------------

I was about to say "that can't possibly be right", but it looks like MSI files really are
based on the OLE2 document structure. Bizarre!

Any chance you could help us find a really small (a few KB?), liberally licensed .msi file
that we can use for tests?
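
To illustrate the OLE2 point, here is a minimal sketch of how a detector could combine the OLE2 header signature with the file extensions proposed in the issue. It is not Tika's actual detection code: the `guess_type` helper and the `application/octet-stream` fallback are assumptions made for illustration, while the signature bytes are the standard OLE2 (Compound File Binary) header and the type names come from the proposed definition quoted below.

```python
import io

# OLE2 (Compound File Binary) header signature: the first 8 bytes of the file.
OLE2_MAGIC = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])

# Extensions taken from the mime-type definition proposed in the issue.
INSTALLER_EXTENSIONS = (".msi", ".msp", ".mst")


def looks_like_ole2(stream):
    """Return True if the stream starts with the OLE2 signature."""
    return stream.read(len(OLE2_MAGIC)) == OLE2_MAGIC


def guess_type(name, stream):
    """Hypothetical helper: magic check first, then extension (glob) match."""
    if not looks_like_ole2(stream):
        # Assumed fallback for this sketch only.
        return "application/octet-stream"
    if name.lower().endswith(INSTALLER_EXTENSIONS):
        return "application/x-ms-installer"
    # Any other OLE2 container falls back to the parent type.
    return "application/x-tika-msoffice"
```

For example, `guess_type("setup.msi", io.BytesIO(OLE2_MAGIC))` yields the installer type, whereas a `setup.msi` whose content does not start with the OLE2 signature does not. This mirrors the `sub-class-of` relationship in the proposal: the magic identifies the container, and the extension narrows it.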

> MSI file detection
> ------------------
>
>                 Key: TIKA-1326
>                 URL: https://issues.apache.org/jira/browse/TIKA-1326
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.5
>            Reporter: Luis Filipe Nassif
>            Priority: Minor
>
> Please remove the *.msi extension from the application/x-msdownload mime-type definition, where it is
> incorrectly listed, and add the following mime-type to tika-mimetypes.xml:
> {code}
> <mime-type type="application/x-ms-installer">
>     <_comment>Microsoft Windows Installer</_comment>
>     <sub-class-of type="application/x-tika-msoffice"/>
>     <glob pattern="*.msi"/>
>     <glob pattern="*.msp"/>
>     <glob pattern="*.mst"/>
> </mime-type>
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
