nutch-dev mailing list archives

From zeroleaf <zeroleaf...@gmail.com>
Subject Nutch won't fetch the whole page if the Transfer-Encoding is chunked
Date Tue, 16 Sep 2014 14:59:46 GMT
     Recently, while using Nutch, I found that when the Transfer-Encoding of
a response is chunked, Nutch does not fetch the whole page, only part of it.
Is this the intended behavior in Nutch, or is it a bug? If it is intended,
how do I configure Nutch to fetch the whole page?

For example, add the URL below to the seed directory:

http://search.dangdang.com/?key=%CA%FD%BE%DD%BF%E2

Then look at the fetched HTML in the content; it is only part of the page.
I tested this on Nutch 1.x (1.9 and 1.10).
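
To make clear what I mean by "the whole page": with chunked Transfer-Encoding,
the body is only complete once every chunk up to the terminating zero-length
chunk has been read. Below is a minimal sketch of that rule in Python. It is
not Nutch code, and the function name decode_chunked is only made up for
illustration; it assumes the raw chunked body is already available as bytes.

```python
def decode_chunked(raw):
    """Decode an HTTP/1.1 chunked body held in `raw` (bytes).

    Returns (payload, complete). `complete` is True only if the
    terminating zero-length chunk was reached. Hypothetical helper
    for illustration, not part of Nutch.
    """
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.find(b"\r\n", pos)
        if line_end == -1:
            return bytes(body), False  # chunk-size line missing: truncated
        # The size is hexadecimal; drop any chunk extension after ';'.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        try:
            size = int(size_field.decode("ascii"), 16)
        except ValueError:
            return bytes(body), False  # malformed chunk-size line
        pos = line_end + 2
        if size == 0:
            return bytes(body), True  # terminating chunk: body is complete
        chunk = raw[pos:pos + size]
        body += chunk
        if len(chunk) < size:
            return bytes(body), False  # data ended in the middle of a chunk
        pos += size + 2  # skip the chunk data and its trailing CRLF


full = b"5\r\nHello\r\n6\r\n World\r\n0\r\n\r\n"
cut = b"5\r\nHello\r\n6\r\n Wo"
print(decode_chunked(full))
print(decode_chunked(cut))
```

What I see in the fetched content looks like the second case: the start of the
page is there, but the rest is missing.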

Thanks.
