tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-102) Parser implementations loading a large amount of content into a single String could be problematic
Date Tue, 20 Nov 2007 14:40:45 GMT

    [ https://issues.apache.org/jira/browse/TIKA-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543907 ]

Jukka Zitting commented on TIKA-102:
------------------------------------

+1!

I would rather have the Appendable->ContentHandler mapping happen through an adapter
class for maximum separation of concerns, but that's a minor issue.
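
For illustration, such an adapter might look roughly like the sketch below. This is not code from the attached patch: the class name ContentHandlerAppendable is hypothetical, and the sketch assumes that forwarding every append to ContentHandler.characters() and wrapping SAXException in IOException is acceptable.

```java
import java.io.IOException;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

/**
 * Hypothetical adapter (name and design are assumptions, not from the patch):
 * exposes a SAX ContentHandler as a java.lang.Appendable so the handler
 * itself does not need to implement Appendable.
 */
class ContentHandlerAppendable implements Appendable {
    private final ContentHandler handler;

    ContentHandlerAppendable(ContentHandler handler) {
        this.handler = handler;
    }

    @Override
    public Appendable append(CharSequence csq) throws IOException {
        // The Appendable contract says a null sequence is appended as "null".
        CharSequence s = (csq == null) ? "null" : csq;
        return append(s, 0, s.length());
    }

    @Override
    public Appendable append(CharSequence csq, int start, int end) throws IOException {
        CharSequence s = (csq == null) ? "null" : csq;
        char[] chars = s.subSequence(start, end).toString().toCharArray();
        emit(chars, chars.length);
        return this;
    }

    @Override
    public Appendable append(char c) throws IOException {
        emit(new char[] { c }, 1);
        return this;
    }

    // Forward the characters to the SAX handler; Appendable can only throw
    // IOException, so a SAXException is wrapped.
    private void emit(char[] chars, int length) throws IOException {
        try {
            handler.characters(chars, 0, length);
        } catch (SAXException e) {
            throw new IOException(e);
        }
    }
}
```

With this shape, parsers depend only on Appendable and the content handler stays a plain SAX handler, which is the separation of concerns mentioned above.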

> Parser implementations loading a large amount of content into a single String could be problematic
> --------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-102
>                 URL: https://issues.apache.org/jira/browse/TIKA-102
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Niall Pemberton
>         Attachments: TIKA-102-appendable-v1.patch
>
>
> A number of the parser implementations build one large String of the content and then
> pass it to the ContentHandler. It would be better to write to the ContentHandler as the
> content is parsed. Attaching a patch that changes the parsers to write to an "Appendable"
> (which java.io.Writer has implemented since JDK 1.5) and changes XMLContentHandler to also
> implement Appendable.
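
The idea in the description above can be sketched as follows: rather than collecting all extracted text in one String, a parser copies it to an Appendable in small chunks. This is a minimal sketch and not the contents of TIKA-102-appendable-v1.patch; the class name StreamingTextExtractor, the copy method, and the chunkSize parameter are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.CharBuffer;

/** Hypothetical helper (not from the patch) illustrating chunked output. */
class StreamingTextExtractor {

    /**
     * Copies text from the reader to the output in chunks of at most
     * chunkSize characters, so the whole content is never held in a
     * single String. Returns the number of characters copied.
     */
    static long copy(Reader in, Appendable out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        char[] buffer = new char[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Wrap only the filled part of the buffer; no intermediate String.
            out.append(CharBuffer.wrap(buffer, 0, n));
            total += n;
        }
        return total;
    }
}
```

Any Appendable works as the target, for example a StringBuilder in a test or a java.io.Writer in production, so memory use is bounded by the chunk size rather than by the document size.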

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

