nutch-dev mailing list archives

From "Antony Bowesman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-564) External parser supports encoding attribute
Date Wed, 03 Oct 2007 21:50:50 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antony Bowesman updated NUTCH-564:
----------------------------------

    Attachment: ExtParser_0.9.0.patch

Patch for release 0.9

> External parser supports encoding attribute
> -------------------------------------------
>
>                 Key: NUTCH-564
>                 URL: https://issues.apache.org/jira/browse/NUTCH-564
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Antony Bowesman
>            Priority: Minor
>             Fix For: 1.0.0
>
>         Attachments: ExtParser_0.9.0.patch
>
>
> When an external component generates text that is returned to the external parser,
> the parser always converts that text using the default character set (os.toString()).
> For example, the returned text may be UTF-8, but it will not be converted to a String
> correctly if the default character set is different.
> I added the attribute <encoding> to the <implementation> XML in plugin.xml,
> and this is then used to convert the text.
> I made my original fix in my local 0.9, but have made a patch based on the trunk.
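
To illustrate the problem described above, here is a minimal standalone sketch, not the code from the attached patch. The class name ExtParserEncodingSketch and the helper decode() are hypothetical; the only real API used is java.io.ByteArrayOutputStream, whose toString() decodes with the platform default character set and whose toString(String) decodes with a named one. ISO-8859-1 stands in here for a mismatched default.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class ExtParserEncodingSketch {

    // Hypothetical helper: decode the captured output with the configured
    // encoding, or fall back to the platform default when none is given.
    static String decode(ByteArrayOutputStream os, String encoding) {
        if (encoding == null || encoding.isEmpty()) {
            return os.toString(); // platform default character set
        }
        try {
            return os.toString(encoding); // explicitly named character set
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // Simulate an external command that writes UTF-8 bytes.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        os.write(utf8, 0, utf8.length);

        // Decoding with the matching charset recovers the original text.
        System.out.println(decode(os, "UTF-8").equals("caf\u00e9"));

        // Decoding with a mismatched charset turns the two-byte UTF-8
        // sequence for the accented letter into two separate characters.
        System.out.println(decode(os, "ISO-8859-1").length());
    }
}
```

With the matching encoding the text round-trips; with the mismatched one the four-character string comes back as five characters, which is the kind of corruption the <encoding> attribute is meant to avoid.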

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

