tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-788) DWG parser infinite loop on possibly corrupt file
Date Sat, 30 Jun 2012 17:31:06 GMT

    [ https://issues.apache.org/jira/browse/TIKA-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404576#comment-13404576 ]

Nick Burch commented on TIKA-788:
---------------------------------

I've had a go at this in r1355780, changing the logic to skip the header section if the apparent
offset is over 10 MB. The header is normally very close to the start of the file, so this shouldn't
affect real files.
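
For illustration only, here is a minimal sketch of that kind of guard, combined with the
end-of-stream check proposed in the issue description. It is not the code committed in
r1355780: the class name, the method name and the exact 10 MB constant are assumptions made
up for this example, and it reads from a plain InputStream rather than going through Tika's
IOUtils.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Tika.
final class HeaderSkipSketch {

    // Assumed cutoff mirroring the "over 10 MB" rule; the constant actually
    // used in r1355780 is not shown in this message.
    static final long MAX_HEADER_OFFSET = 10L * 1024 * 1024;

    /**
     * Tries to advance the stream by toSkip bytes.
     * Returns false if the offset is implausible or the stream ends early.
     */
    static boolean skipToOffset(InputStream stream, long toSkip) throws IOException {
        if (toSkip < 0 || toSkip > MAX_HEADER_OFFSET) {
            return false; // implausible offset: treat the header as absent
        }
        byte[] buffer = new byte[0x4000];
        while (toSkip > 0) {
            int wanted = (int) Math.min(toSkip, buffer.length);
            int read = stream.read(buffer, 0, wanted);
            if (read == -1) {
                return false; // end of stream reached before the offset
            }
            toSkip -= read; // count only the bytes actually consumed
        }
        return true;
    }
}
```

Because the loop subtracts only the bytes actually read and returns as soon as read() reports
end of stream, it cannot spin on a truncated file, and an out-of-range offset is rejected
before any bytes are consumed.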

What would be good is if we could get one of these problematic files, along with the metadata
that AutoCAD reports for it (ideally the same set of test values that our other sample files
have). We can then hopefully work out how to distinguish the two kinds of files, and how to
find the metadata in these ones.
                
> DWG parser infinite loop on possibly corrupt file
> -------------------------------------------------
>
>                 Key: TIKA-788
>                 URL: https://issues.apache.org/jira/browse/TIKA-788
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Stas Shaposhnikov
>
> When parsing some dwg items, it is possible that the parser may cause itself to go into an infinite loop.
> Attached is the file causing the problem.
> Here is a possible patch that will at least proceed until an error is thrown.
> {noformat}
> === modified file 'tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java'
> --- tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java        2011-11-24 11:30:33 +0000
> +++ tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java        2011-11-25 05:27:41 +0000
> @@ -274,8 +274,10 @@
>              return false;
>          }
>          while (toSkip > 0) {
> -            byte[] skip = new byte[Math.min((int) toSkip, 0x4000)];
> -            IOUtils.readFully(stream, skip);
> +            byte[] skip = new byte[(int) Math.min(toSkip, 0x4000)];
> +            if (IOUtils.readFully(stream, skip) == -1) {
> +               return false; //invalid skip
> +            }
>              toSkip -= skip.length;
>          }
>          return true;
> {noformat}


        
