tika-dev mailing list archives

From "Benjamin Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-570) If this is a BMP, my name is horatio alger
Date Sun, 12 Dec 2010 20:00:02 GMT

     [ https://issues.apache.org/jira/browse/TIKA-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Douglas updated TIKA-570:
----------------------------------

    Attachment: TIKA-570.patch

I am attaching a patch that encodes the "BM" prefix, the color planes signature, and the possible
bit count values in tika-mimetypes.xml. Because the rule checks for the "BM" magic, I believe
it should not conflict with the OS/2 variations, which use different magic values such as
"BA" and "CI".
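
For illustration only, the same three checks can be sketched in plain Java. This is not the
patch itself (the patch expresses the rules as magic entries in tika-mimetypes.xml). The
class and method names are hypothetical, the offsets 26 and 28 assume a BITMAPINFOHEADER-style
layout (14-byte file header followed by a 40-byte or larger info header), and the accepted
bit counts are my assumption of the usual values.

```java
// Hypothetical helper, not part of Tika: mirrors the three header checks described above.
class BmpHeaderCheck {

    /** Reads an unsigned 16-bit little-endian value starting at the given offset. */
    static int readLe16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    /**
     * Returns true only if the leading bytes carry the "BM" magic, a color
     * planes value of 1, and a plausible bits-per-pixel value.
     * Offsets assume a BITMAPINFOHEADER-style layout (assumption).
     */
    static boolean looksLikeBmp(byte[] head) {
        if (head == null || head.length < 30) {
            return false; // too short to contain the planes and bit count fields
        }
        if (head[0] != 'B' || head[1] != 'M') {
            return false; // OS/2 variants such as "BA" or "CI" are rejected here
        }
        if (readLe16(head, 26) != 1) {
            return false; // color planes must be 1
        }
        switch (readLe16(head, 28)) {
            case 1:
            case 4:
            case 8:
            case 16:
            case 24:
            case 32:
                return true; // assumed set of valid bit counts
            default:
                return false;
        }
    }
}
```

With checks like these, a text file that merely begins with the letters "BM" is rejected,
because the bytes at offsets 26 to 29 are very unlikely to form a planes value of 1 followed
by a valid bit count.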

This patch adds the original text file to the test document set and confirms in the unit test
that it is not detected as a bitmap.

> If this is a BMP, my name is horatio alger
> ------------------------------------------
>
>                 Key: TIKA-570
>                 URL: https://issues.apache.org/jira/browse/TIKA-570
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Benson Margulies
>         Attachments: C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, TIKA-570.patch
>
>
> I am attaching a file which Tika is identifying as a bmp. It contains ordinary text.
>  
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.image.ImageParser@20a19811
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
> 	at com.basistech.jug.FileHarvester.process(FileHarvester.java:204)
> 	at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:165)
> 	at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:179)
> 	at com.basistech.jug.FileHarvester.harvest(FileHarvester.java:135)
> 	at com.basistech.jug.FileHarvester.run(FileHarvester.java:247)
> 	at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.RuntimeException: New BMP version not implemented yet.
> 	at com.sun.imageio.plugins.bmp.BMPImageReader.readHeader(BMPImageReader.java:462)
> 	at com.sun.imageio.plugins.bmp.BMPImageReader.getWidth(BMPImageReader.java:174)
> 	at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:75)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
> 	... 8 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

