tika-dev mailing list archives

From "Niall Pemberton (JIRA)" <j...@apache.org>
Subject [jira] Updated: (TIKA-102) Parser implementations loading a large amount of content into a single String could be problematic
Date Tue, 20 Nov 2007 06:04:43 GMT

     [ https://issues.apache.org/jira/browse/TIKA-102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niall Pemberton updated TIKA-102:
---------------------------------

    Attachment: TIKA-102-appendable-v1.patch

> Parser implementations loading a large amount of content into a single String could be problematic
> --------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-102
>                 URL: https://issues.apache.org/jira/browse/TIKA-102
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Niall Pemberton
>         Attachments: TIKA-102-appendable-v1.patch
>
>
> A number of the parser implementations create one large String of the content and then
> pass it to the ContentHandler. It would be better to write the content to the ContentHandler
> as it is parsed. I am attaching a patch that changes the parsers to write to an "Appendable"
> (which java.io.Writer has implemented since JDK 1.5) and changes XMLContentHandler to
> implement Appendable as well.
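
To illustrate the idea described in the issue, here is a minimal sketch of streaming text to an Appendable in chunks rather than accumulating one large String first. It is not taken from the attached patch: the class name AppendableSketch and the helper copyText are hypothetical, and a StringBuilder stands in for the sink where a java.io.Writer or an Appendable-aware handler could be used instead.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;

// Hypothetical illustration only; not code from the TIKA-102 patch.
public class AppendableSketch {

    // Streams characters from the source to the sink in fixed-size chunks,
    // so the full content is never held in a single String by this method.
    // Returns the number of characters copied.
    static long copyText(Reader source, Appendable sink) throws IOException {
        char[] buffer = new char[4096];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            // CharBuffer.wrap exposes the filled part of the buffer as a
            // CharSequence without creating an intermediate String.
            sink.append(CharBuffer.wrap(buffer, 0, read));
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Any Appendable works here: StringBuilder, java.io.Writer, etc.
        StringBuilder sink = new StringBuilder();
        long copied = copyText(new StringReader("extracted text"), sink);
        System.out.println(copied + " chars: " + sink);
    }
}
```

Because the method depends only on the Appendable interface, the caller decides where the text goes; memory use in the copy loop stays bounded by the buffer size regardless of document length.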

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

