nutch-dev mailing list archives

From AJ Chen <cano...@gmail.com>
Subject what contributes to fetch slowing down
Date Wed, 28 Sep 2005 17:27:55 GMT
I started the crawler with about 2000 sites. The fetcher achieved about
7 pages/sec initially, but performance gradually dropped to about 2
pages/sec, and sometimes as low as 0.5 pages/sec. The fetch list had
300k pages and I used 500 threads. What are the main causes of this
slowdown? Below are sample status lines:

050927 005952 status: segment 20050927005922, 100 pages, 3 errors, 1784615 bytes, 14611 ms
050927 005952 status: 6.8441586 pages/s, 954.2334 kb/s, 17846.15 bytes/page
050927 010005 status: segment 20050927005922, 200 pages, 9 errors, 3656863 bytes, 28170 ms
050927 010005 status: 7.0997515 pages/s, 1014.1726 kb/s, 18284.314 bytes/page

After some time ...
050927 171818 status: segment 20050927070752, 101400 pages, 7201 errors, 2593026554 bytes, 36216316 ms
050927 171818 status: 2.799843 pages/s, 559.3617 kb/s, 25572.254 bytes/page
050927 171832 status: segment 20050927070752, 101500 pages, 7204 errors, 2595591632 bytes, 36230516 ms
050927 171832 status: 2.8015058 pages/s, 559.6956 kb/s, 25572.332 bytes/page
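
For reading these numbers: the pages/s figure in each status pair appears to be
total pages divided by total elapsed time for the segment, which matches the
values above (100 pages / 14.611 s = 6.844 pages/s). The sketch below is a
minimal Python illustration, not part of Nutch; the function names
parse_status, cumulative_rate and interval_rate are hypothetical, and the
regular expression assumes the status line layout shown in this message. It
recomputes the cumulative rate and also the rate between two consecutive
status lines.

```python
import re

# Assumed layout of the first line of each status pair, as shown above:
# "... status: segment <id>, <n> pages, <n> errors, <n> bytes, <n> ms"
STATUS_RE = re.compile(
    r"status: segment (\d+), (\d+) pages, (\d+) errors,\s*(\d+) bytes, (\d+) ms"
)


def parse_status(line):
    """Return the counters from one status line, or None if it does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    segment, pages, errors, nbytes, ms = m.groups()
    return {
        "segment": segment,
        "pages": int(pages),
        "errors": int(errors),
        "bytes": int(nbytes),
        "ms": int(ms),
    }


def cumulative_rate(status):
    """Pages per second averaged over the whole segment so far."""
    return status["pages"] * 1000.0 / status["ms"]


def interval_rate(prev, cur):
    """Pages per second between two status lines of the same segment."""
    return (cur["pages"] - prev["pages"]) * 1000.0 / (cur["ms"] - prev["ms"])


# Status lines copied from this message.
early = parse_status(
    "050927 005952 status: segment 20050927005922, 100 pages, 3 errors, "
    "1784615 bytes, 14611 ms"
)
late_a = parse_status(
    "050927 171818 status: segment 20050927070752, 101400 pages, 7201 errors, "
    "2593026554 bytes, 36216316 ms"
)
late_b = parse_status(
    "050927 171832 status: segment 20050927070752, 101500 pages, 7204 errors, "
    "2595591632 bytes, 36230516 ms"
)

print("early cumulative pages/s:", cumulative_rate(early))
print("late cumulative pages/s: ", cumulative_rate(late_b))
print("late interval pages/s:   ", interval_rate(late_a, late_b))
```

Between the last two status lines the fetcher handled 100 pages in 14200 ms,
so the interval rate can differ a lot from the cumulative figure, which also
averages in any earlier slow periods of the segment.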

Thanks,
AJ

