nutch-dev mailing list archives

From yoursoft <yours...@freemail.hu>
Subject indexed records in segments
Date Sun, 17 Jul 2005 07:09:51 GMT
Dear Developers,

I made a modified version of SegmentRead.java. The modification also
shows the number of indexed docs.
I think this is useful when balancing segments between backends.

The number of docs actually used is the indexed number, which is not equal 
to the number of records in the segment. After fetching, the number of 
indexed records drops to 80-90% of the records. After 'dedup' this 
percentage is lower still, and after 'prune' it decreases further. In 
your segments the record count and the number of records actually used 
can be very different.
So when you look at the balance of your backends, the load average can 
be very different too.
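
To illustrate the balancing idea, here is a minimal sketch of assigning segments to backends by indexed document count instead of raw record count. It is not the SegmentRead.java modification itself. The class SegmentBalancer, the Segment holder, the segment names, and the counts are all hypothetical, and the greedy largest-first strategy is only one possible approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SegmentBalancer {

    /** Hypothetical holder: a segment name plus its indexed-document count. */
    public static final class Segment {
        final String name;
        final long indexedDocs;

        public Segment(String name, long indexedDocs) {
            this.name = name;
            this.indexedDocs = indexedDocs;
        }
    }

    /**
     * Greedy assignment: take segments largest first and give each one to the
     * backend that currently holds the fewest indexed documents.
     * Returns the total number of indexed documents per backend.
     */
    public static long[] balance(List<Segment> segments, int backends) {
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((Segment s) -> s.indexedDocs).reversed());

        long[] load = new long[backends];
        for (Segment s : sorted) {
            // Find the backend with the smallest load so far.
            int target = 0;
            for (int i = 1; i < backends; i++) {
                if (load[i] < load[target]) {
                    target = i;
                }
            }
            load[target] += s.indexedDocs;
        }
        return load;
    }

    public static void main(String[] args) {
        // Made-up indexed counts, for illustration only.
        List<Segment> segments = Arrays.asList(
            new Segment("seg-a", 900),
            new Segment("seg-b", 500),
            new Segment("seg-c", 450),
            new Segment("seg-d", 150));
        System.out.println(Arrays.toString(balance(segments, 2)));
    }
}
```

With these made-up counts, the two backends end up with 1050 and 950 indexed docs. Feeding in raw record counts instead could produce a different split, which is the imbalance described above.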

I think the load average also depends on the average 'boost' value of 
the segment.

What do you think about these things?

Regards,
    Ferenc
