nutch-dev mailing list archives

From sumittyagi <>
Subject Filtering Pages while crawling
Date Tue, 17 Nov 2009 18:48:48 GMT

How can I filter out certain pages, such as Privacy Policies, Terms and
Conditions, etc., while crawling? All of these pages contain bogus
information. I am new to Nutch.
Please let me know how to do this.

Thanks in advance.