nutch-dev mailing list archives

From "lufeng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1736) Can't fetch page if http response header contains Transfer-Encoding:chunked
Date Mon, 17 Mar 2014 03:31:44 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937426#comment-13937426 ]

lufeng commented on NUTCH-1736:
-------------------------------

Hi ysc,

You can fix this issue by checking the content size, like this:

{code:java}
// Truncate the chunk so the total bytes read never exceed the content limit.
if (http.getMaxContent() >= 0
    && (contentBytesRead + chunkLen) > http.getMaxContent()) {
  chunkLen = http.getMaxContent() - contentBytesRead;
}
{code}
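
To make the effect of that check easier to see, here is a minimal standalone sketch. It is not the Nutch code: the class ChunkLimit and the methods clampChunkLen and totalKept are hypothetical names made up for illustration, and maxContent stands in for the value of http.getMaxContent(). It assumes, as in the snippet above, that a negative limit means "no limit".

```java
// Hypothetical illustration only; none of these names exist in Nutch.
final class ChunkLimit {

  // Returns how many bytes of the next chunk may still be kept.
  // A negative maxContent is treated as "no limit" (assumption taken
  // from the ">= 0" test in the snippet above).
  static int clampChunkLen(int maxContent, int contentBytesRead, int chunkLen) {
    if (maxContent >= 0 && (contentBytesRead + chunkLen) > maxContent) {
      return maxContent - contentBytesRead;
    }
    return chunkLen;
  }

  // Simulates reading a sequence of chunk lengths and returns the
  // total number of bytes kept once the limit has been applied.
  static int totalKept(int maxContent, int[] chunkLens) {
    int contentBytesRead = 0;
    for (int chunkLen : chunkLens) {
      int toRead = clampChunkLen(maxContent, contentBytesRead, chunkLen);
      contentBytesRead += toRead;
      if (toRead < chunkLen) {
        break; // limit reached, stop reading further chunks
      }
    }
    return contentBytesRead;
  }
}
```

With a limit of 100 and three chunks of 40 bytes each, the third chunk is cut to 20 bytes, so the total kept is 100. With a negative limit all 120 bytes are kept. The sketch ignores int overflow in contentBytesRead + chunkLen, which real code reading untrusted chunk sizes may need to guard against.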

> Can't fetch page if http response header contains Transfer-Encoding:chunked
> ---------------------------------------------------------------------------
>
>                 Key: NUTCH-1736
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1736
>             Project: Nutch
>          Issue Type: Bug
>          Components: protocol
>    Affects Versions: 1.6, 2.1, 1.7, 2.2, 2.3, 1.8, 2.4, 1.9, 2.2.1
>            Reporter: ysc
>            Priority: Critical
>             Fix For: 2.3, 1.9
>
>         Attachments: nutch-2.2.1.patch, nutch1.7.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> fetching: http://szs.mof.gov.cn/zhengwuxinxi/zhengcefabu/201402/t20140224_1046354.html
> Fetch failed with protocol status: EXCEPTION: java.io.IOException: unzipBestEffort returned null



--
This message was sent by Atlassian JIRA
(v6.2#6252)
