nutch-dev mailing list archives

From zeroleaf <>
Subject Nutch won't fetch the whole page if the Transfer-Encoding is chunked
Date Tue, 16 Sep 2014 14:59:46 GMT
Recently, while using Nutch, I found that when a response's Transfer-Encoding
is chunked, Nutch does not fetch the whole page, only part of it. Is this the
expected behavior in Nutch, or is it a bug? If it is expected, how do I
configure Nutch to fetch the whole page?

For example, add the URL below to the seed directory:

Then look at the fetched HTML in content: it contains only part of the page.
The versions I tested are Nutch 1.x (1.9 and 1.10).
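
For reference, this is how I understand a chunked body is supposed to be
reassembled. It is only a minimal Python sketch to show what "the whole page"
means here, not Nutch's code; the function name decode_chunked and the sample
bytes are made up for illustration.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Reassemble an HTTP/1.1 chunked body into the full payload.

    Hypothetical helper for illustration; trailers after the last
    chunk are ignored.
    """
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hex, terminated by CRLF.
        line_end = raw.index(b"\r\n", pos)
        # Drop optional chunk extensions after ';'.
        size_field = raw[pos:line_end].split(b";", 1)[0]
        size = int(size_field.strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


# Made-up sample: two chunks followed by the terminating zero-size chunk.
sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(sample))
```

A client that stopped reading before the zero-size chunk would end up with
only part of the page, which looks like what I am seeing.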

