nutch-dev mailing list archives

From yoursoft <>
Subject indexed records in segments
Date Sun, 17 Jul 2005 07:09:51 GMT
Dear Developers,

I have made a modified version of [...]. This modification also shows
the number of indexed docs.
I think this is useful when balancing segments between backends.

The number of docs actually used is the indexed number, which is not
equal to the number of records in the segment. After fetching, the number
of indexed records decreases to 80-90% of the records. After 'dedup' this
percentage is lower still, and after 'prune' it decreases further. In
your segments the record count and the number of records actually used
will be very different.
When you look at the balance of your backends, the load average will be very [...]

I think the load average also depends on the average 'boost' value of
the segment.
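
To make the balancing idea above concrete, here is a minimal sketch that
distributes segments over backends by indexed doc count rather than by raw
record count. It is not Nutch code: the class SegmentBalancer, its methods,
and the doc counts in main are hypothetical, and it uses a simple greedy
rule (largest segment first, onto the least loaded backend). It ignores the
'boost' value entirely.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Nutch: balances segments by indexed doc count.
final class SegmentBalancer {

    // Returns result[i] = index of the backend chosen for segment i.
    // Segments are taken from largest to smallest indexed-doc count and each
    // one goes to the backend with the smallest running total so far.
    static int[] assign(long[] indexedDocs, int backends) {
        Integer[] order = new Integer[indexedDocs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
                Comparator.comparingLong((Integer i) -> indexedDocs[i]).reversed());

        long[] load = new long[backends];
        int[] result = new int[indexedDocs.length];
        for (int seg : order) {
            int best = 0;
            for (int b = 1; b < backends; b++) {
                if (load[b] < load[best]) {
                    best = b;
                }
            }
            load[best] += indexedDocs[seg];
            result[seg] = best;
        }
        return result;
    }

    // Sums the indexed docs that end up on each backend for a given assignment.
    static long[] totals(long[] indexedDocs, int[] assignment, int backends) {
        long[] sum = new long[backends];
        for (int i = 0; i < indexedDocs.length; i++) {
            sum[assignment[i]] += indexedDocs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up indexed doc counts for five segments (illustration only).
        long[] indexed = {900_000L, 850_000L, 400_000L, 350_000L, 300_000L};
        int[] assignment = assign(indexed, 2);
        System.out.println("backend per segment: " + Arrays.toString(assignment));
        System.out.println("indexed docs per backend: "
                + Arrays.toString(totals(indexed, assignment, 2)));
    }
}
```

The greedy rule does not guarantee the most even split, but it only needs
the per-segment indexed counts that the modification reports.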

What do you think about these things?

