[ https://issues.apache.org/jira/browse/TIKA-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886174#action_12886174 ]
Chris A. Mattmann commented on TIKA-459:
----------------------------------------
+1, Ken this looks good to me.
Cheers,
Chris
> Improve handling of incorrect charset names in HTTP response header
> -------------------------------------------------------------------
>
> Key: TIKA-459
> URL: https://issues.apache.org/jira/browse/TIKA-459
> Project: Tika
> Issue Type: Improvement
> Reporter: Ken Krugler
> Assignee: Ken Krugler
> Priority: Minor
> Attachments: TIKA-459.patch
>
>
> While crawling a few million pages, I collected stats for charset names that weren't valid.
> The attached patch "fixes up" most of these that I encountered, and thus should improve the accuracy of parse results.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.