nutch-dev mailing list archives

From Andrzej Bialecki
Subject DistributedSearch$Client.updateSegments() blocking other threads
Date Thu, 15 Sep 2005 15:01:33 GMT

I was doing performance testing of a distributed search setup, with 
JMeter, using the code from trunk/.

Whenever one of the backend Servers goes down, there is a hiccup on the
frontend, because all ParallelCalls started by the Client that still
use the dead address need to time out. This is expected, and acceptable.

New calls made in the meantime (before updateSegments() discovers
that the host is down) will also need to time out, which is less
acceptable. I think it could be improved by removing the offending
address at the first sign of trouble, i.e. not waiting for
updateSegments() but immediately removing the dead host from
liveAddresses. Anyway, read on...
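
To make the idea of removing a dead host at the first sign of trouble
concrete, here is a minimal sketch. It is not the Nutch code: the class
LiveAddressSet and its methods are hypothetical names, and I am assuming
that a failed call surfaces as an IOException.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical helper, not part of Nutch: tracks which backends are live.
class LiveAddressSet {
    // Copy-on-write set, so readers never block while a host is removed.
    private final Set<InetSocketAddress> live = new CopyOnWriteArraySet<>();

    public void markAlive(InetSocketAddress addr) { live.add(addr); }

    public void markDead(InetSocketAddress addr) { live.remove(addr); }

    // Returns a copy, so callers can iterate without seeing later changes.
    public Set<InetSocketAddress> snapshot() { return new HashSet<>(live); }

    // Runs one call against addr. On an I/O failure the address is dropped
    // right away instead of waiting for the next periodic refresh.
    public <T> T call(InetSocketAddress addr, Callable<T> rpc) throws Exception {
        try {
            return rpc.call();
        } catch (IOException e) {
            markDead(addr);
            throw e;
        }
    }
}
```

A later periodic refresh would call markAlive() again once the host
answers, so an evicted address is not lost for good.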

What was curious was that the same hiccup would then occur every 10 
seconds, which is the hardcoded interval for calling 
Client.updateSegments(). It was as if the call to updateSegments() was 
synchronized on the whole class, so that all other calls are blocked 
until updateSegments() completes. I modified the code, so that instead 
of using DistributedSearch$Client itself as a Thread instance, a new 
independent Thread instance is created.
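
Roughly, the shape of the change is the following. This is a simplified
sketch and not the attached patch: SegmentUpdater and its fields are
hypothetical names, and the refresh work is passed in as a plain
Runnable standing in for updateSegments().

```java
// Hypothetical sketch: the periodic refresh runs on its own Thread object
// rather than on the client object itself.
class SegmentUpdater implements Runnable {
    private final Runnable update;        // stands in for updateSegments()
    private final long intervalMillis;    // 10000 in the setup described here
    private volatile boolean running = true;

    public SegmentUpdater(Runnable update, long intervalMillis) {
        this.update = update;
        this.intervalMillis = intervalMillis;
    }

    // Creates and starts a separate daemon thread for the refresh loop.
    public Thread start() {
        Thread t = new Thread(this, "segment-updater");
        t.setDaemon(true);
        t.start();
        return t;
    }

    @Override
    public void run() {
        while (running) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                return; // interrupted: stop the loop
            }
            update.run();
        }
    }

    public void stop() { running = false; }
}
```

The client would then do something like
new SegmentUpdater(this::updateSegments, 10000).start() in its
constructor. The sketch only shows the structure; it does not by itself
explain why the original arrangement blocked other calls.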

The hiccups are gone now - the list of liveAddresses is still being 
updated as it should whenever Servers go down/up, but now 
updateSegments() doesn't interfere with other calls. I attach the patch 
- but to be honest I'm still not quite sure what was happening...

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
