incubator-droids-dev mailing list archives

From "Tony Dietrich" <t...@dietrich.org.uk>
Subject Filtering question: Is there a way...
Date Thu, 20 Oct 2011 14:05:50 GMT
... when using Droids to crawl a site, to read and parse pages that have got
through the filters that have been set up, but stop them being passed to the
handler?

I.e., some sort of post-parsing filter, like the AlreadyVisitedFilter, but one
that is applied after the page has been parsed for new links and before the
handler is triggered?

Or do I have to wait until it hits the handler and then check my cache to
see if I've already got that page? (I'm just trying to separate out my
processing stages.)
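
The flow being asked about can be sketched roughly as below. This is a minimal illustration in plain Java, not Droids code: `PostParseFilterSketch`, `crawl`, the in-memory `site` map, and the `postParseFilter` predicate are all hypothetical names invented for this example, and whether Droids offers an equivalent hook is exactly the open question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch only: none of these names are Droids classes or methods.
final class PostParseFilterSketch {

    /**
     * Walks an in-memory "site" (page URL mapped to its outgoing links).
     * Every reachable page is parsed for links, but only pages accepted by
     * postParseFilter are returned as "handled".
     */
    static List<String> crawl(String start,
                              Map<String, List<String>> site,
                              Predicate<String> postParseFilter) {
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        List<String> handled = new ArrayList<>();

        queue.add(start);
        visited.add(start);

        while (!queue.isEmpty()) {
            String url = queue.poll();

            // Stage 1: "parse" the page. Links are always extracted and queued,
            // even if the page itself is later kept away from the handler.
            for (String link : site.getOrDefault(url, List.of())) {
                if (visited.add(link)) {
                    queue.add(link);
                }
            }

            // Stage 2: post-parse filter. Only accepted pages reach the handler,
            // which is modelled here as appending to a list.
            if (postParseFilter.test(url)) {
                handled.add(url);
            }
        }
        return handled;
    }
}
```

Calling `crawl` with a predicate such as `url -> !cache.contains(url)` (where `cache` is an assumed set of already-stored URLs) would still follow links out of cached pages while keeping those pages away from the handler, which keeps the cache check out of the handler stage.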

Tony Dietrich
