nutch-dev mailing list archives

From Stefan Dlugolinsky <>
Subject Re: Plugins: when to perform web service requests, on fetch or on index?
Date Thu, 18 Jun 2009 13:55:41 GMT

Well, I would say that the indexing stage is better than the parsing stage,
because parsing can involve many parse filters, which all need to be executed
and consume system resources (several threads run in parallel). In general,
though, there might not be any difference in performance between the two
stages, since there can also be several indexing filters, and they need
system resources too. I would try both variants, measure performance on
some subset of documents, compare the results and choose the better one.
Besides raising performance that way, I would try to cache web service
requests locally; that can save time on repeated calls.
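
A minimal sketch of the local caching idea, assuming the web service is
called with the page URL only and returns some value per URL. The class
name UrlLookupCache and the remoteLookup function are hypothetical and
not part of Nutch; the function stands in for whatever client code
actually calls the web service.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Small LRU cache in front of a per-URL lookup (hypothetical helper, not a Nutch class). */
class UrlLookupCache<V> {
    // Stand-in for the real web service call; supplied by the caller.
    private final Function<String, V> remoteLookup;
    private final Map<String, V> cache;

    UrlLookupCache(final int maxEntries, Function<String, V> remoteLookup) {
        this.remoteLookup = remoteLookup;
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // synchronized because several threads may call the plugin in parallel.
    // This also serializes the remote calls, which keeps the sketch simple.
    synchronized V get(String url) {
        V value = cache.get(url);
        if (value == null) {
            value = remoteLookup.apply(url);   // cache miss: ask the service
            if (value != null) {
                cache.put(url, value);
            }
        }
        return value;
    }
}
```

A filter in either stage could hold one such cache and call get(url)
instead of calling the service directly. Whether this helps depends on
how often the same URL is actually requested more than once.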


2009/6/18 caezar <>

> Hi,
> Thank you for the response. Parsed data is not used in calls. Only page
> URL.
> So will performance be better if I perform these requests at the parsing stage?
