tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-570) If this is a BMP, my name is horatio alger
Date Sun, 12 Dec 2010 05:00:02 GMT

    [ https://issues.apache.org/jira/browse/TIKA-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970579#action_12970579 ]

Nick Burch commented on TIKA-570:
---------------------------------

Reading http://en.wikipedia.org/wiki/BMP_file_format, I'm not sure what else we can rely on
finding, but I'm tempted to say we should also require either "00 00" or "00 00 00" within the
first few KB: a text file shouldn't contain that many nulls, but most bitmaps will.
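
A minimal sketch of the stricter check suggested above, in case it helps the discussion. It is not Tika's actual detection code: the class name BmpSniffer, the method looksLikeBmp and the 4 KB window are all assumptions made for illustration. It requires the "BM" signature that BMP files start with, plus at least one "00 00" pair in the first few KB.

```java
/** Hypothetical helper, not part of Tika: sketches the stricter BMP check proposed above. */
final class BmpSniffer {
    // Assumption: "the first few KB" is taken to mean 4 KB here.
    private static final int WINDOW = 4096;

    private BmpSniffer() {
    }

    /** Returns true if the data starts with "BM" and has a "00 00" pair in the window. */
    static boolean looksLikeBmp(byte[] data) {
        // BMP files begin with the ASCII signature "BM".
        if (data.length < 2 || data[0] != 'B' || data[1] != 'M') {
            return false;
        }
        int limit = Math.min(data.length, WINDOW);
        // Also require two consecutive null bytes after the signature.
        // Any "00 00 00" run contains "00 00", so this is the looser of the two options.
        for (int i = 2; i + 1 < limit; i++) {
            if (data[i] == 0 && data[i + 1] == 0) {
                return true;
            }
        }
        return false;
    }
}
```

With a check like this, plain text that merely happens to begin with "BM" would be rejected, while a typical bitmap header, whose reserved fields are zero, would still match.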

> If this is a BMP, my name is horatio alger
> ------------------------------------------
>
>                 Key: TIKA-570
>                 URL: https://issues.apache.org/jira/browse/TIKA-570
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Benson Margulies
>         Attachments: C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt
>
>
> I am attaching a file which Tika is identifying as a bmp. It contains ordinary text.
>  
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.image.ImageParser@20a19811
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
> 	at com.basistech.jug.FileHarvester.process(FileHarvester.java:204)
> 	at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:165)
> 	at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:179)
> 	at com.basistech.jug.FileHarvester.harvest(FileHarvester.java:135)
> 	at com.basistech.jug.FileHarvester.run(FileHarvester.java:247)
> 	at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.RuntimeException: New BMP version not implemented yet.
> 	at com.sun.imageio.plugins.bmp.BMPImageReader.readHeader(BMPImageReader.java:462)
> 	at com.sun.imageio.plugins.bmp.BMPImageReader.getWidth(BMPImageReader.java:174)
> 	at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:75)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
> 	... 8 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

