tika-dev mailing list archives

From "Alex Ott (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-697) Tika reports the content type of AR archives as "text/plain"
Date Mon, 07 Nov 2011 09:53:52 GMT

    [ https://issues.apache.org/jira/browse/TIKA-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145308#comment-13145308 ]

Alex Ott commented on TIKA-697:

I think that the following magic in tika-mimetypes.xml will be enough (instead of modifying
the Tika code):

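  <mime-type type="application/x-unix-archive">
    <magic priority="50">
      <!-- 0x213C617263683E0A is the ASCII string "!<arch>\n", the AR global header -->
      <match value="0x213C617263683E0A" type="string" offset="0" />
    </magic>
    <glob pattern="*.a"/>
  </mime-type>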
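
For reference, the check that this magic entry describes can be sketched in plain Java. This is only an illustration of matching the eight bytes 21 3C 61 72 63 68 3E 0A ("!<arch>\n") at offset 0, not Tika's actual detection code; the class and method names (ArMagicCheck, isArArchive) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Tika or Commons Compress.
public class ArMagicCheck {

    // "!<arch>\n" in ASCII, i.e. the bytes 21 3C 61 72 63 68 3E 0A
    private static final byte[] AR_MAGIC =
            "!<arch>\n".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the stream starts with the AR global header.
    // Consumes up to 8 bytes from the stream.
    public static boolean isArArchive(InputStream in) throws IOException {
        byte[] header = new byte[AR_MAGIC.length];
        // readNBytes (Java 9+) keeps reading until the buffer is full or EOF
        int read = in.readNBytes(header, 0, header.length);
        return read == AR_MAGIC.length && Arrays.equals(header, AR_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] ar = "!<arch>\nrest of archive".getBytes(StandardCharsets.US_ASCII);
        byte[] text = "plain text".getBytes(StandardCharsets.US_ASCII);
        System.out.println(isArArchive(new ByteArrayInputStream(ar)));   // prints true
        System.out.println(isArArchive(new ByteArrayInputStream(text))); // prints false
    }
}
```

A caller that needs to reuse the stream afterwards would have to wrap it in a BufferedInputStream and use mark/reset around the check; that is omitted here for brevity.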

> Tika reports the content type of AR archives as "text/plain"
> ------------------------------------------------------------
>                 Key: TIKA-697
>                 URL: https://issues.apache.org/jira/browse/TIKA-697
>             Project: Tika
>          Issue Type: Bug
>         Environment: Linux (CentOS 5.6)
>            Reporter: PNS
>            Priority: Trivial
> The Tika.detect(InputStream) method returns "text/plain" for AR archives created with
> the Linux "Create Archive" option of Nautilus (available via right-clicking on a file).
> The Apache Commons Compress "autodetection" code of the ArchiveStreamFactory looks at
> the first 12 bytes of the stream and correctly identifies the type as AR.


