tika-dev mailing list archives

From "Boris Naguet (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-1175) MS Money files wrongly detected as True Type Font
Date Tue, 01 Oct 2013 09:46:23 GMT
Boris Naguet created TIKA-1175:

             Summary: MS Money files wrongly detected as True Type Font
                 Key: TIKA-1175
                 URL: https://issues.apache.org/jira/browse/TIKA-1175
             Project: Tika
          Issue Type: Bug
          Components: mime
    Affects Versions: 1.4, 1.3
            Reporter: Boris Naguet
            Priority: Minor

The TTF magic is probably not specific enough: it incorrectly detects MS Money files as
TTF files, and parsing then throws an exception.
Caused by: ! java.io.IOException: head is mandatory
! at org.apache.fontbox.ttf.AbstractTTFParser.parseTables(AbstractTTFParser.java:107) 

Here is the magic detection that I added to {{custom-mimetypes.xml}}, which solves it:

	<mime-type type="application/x-msmoney">
		<glob pattern="*.mny" />
		<magic priority="60">
			<match value="0x000100004D534953414D204461746162617365" type="string" offset="0" />
		</magic>
	</mime-type>

It can replace the existing empty {{application/x-msmoney}} mime-type in {{tika-mimetypes.xml}}.
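
To illustrate why the two types collide, here is a minimal, self-contained Java sketch (it does not use Tika). It decodes the hex magic proposed above and compares it with the 4-byte TrueType sfnt version {{0x00010000}}, which is assumed here to be what the TTF magic matches. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MagicPrefixDemo {
    // Magic proposed in this report, without the "0x" prefix.
    static final String MSMONEY_MAGIC_HEX = "000100004D534953414D204461746162617365";
    // TrueType sfnt version 1.0; assumed to be the pattern the TTF magic matches.
    static final String TTF_MAGIC_HEX = "00010000";

    // Decode a string of hex digit pairs into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True if data begins with all bytes of magic (a match at offset 0).
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] money = hexToBytes(MSMONEY_MAGIC_HEX);
        byte[] ttf = hexToBytes(TTF_MAGIC_HEX);

        // The bytes after the first four are plain ASCII.
        System.out.println(new String(money, 4, money.length - 4, StandardCharsets.US_ASCII));
        // An MS Money header also satisfies the short 4-byte pattern.
        System.out.println(startsWith(money, ttf));
        // A header that only carries the sfnt version does not satisfy the longer pattern.
        byte[] fontLike = {0x00, 0x01, 0x00, 0x00, 0x00, 0x0A, 0x00, (byte) 0x80};
        System.out.println(startsWith(fontLike, money));
    }
}
```

The first four bytes of the MS Money magic are identical to the short pattern, so a 4-byte match alone cannot tell the two formats apart; the longer 19-byte pattern can.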

magic comes from

This message was sent by Atlassian JIRA
