nutch-dev mailing list archives

From "kauu" <bab...@gmail.com>
Subject hi all:
Date Thu, 14 Dec 2006 14:28:58 GMT
 

I want to know how I can filter the spam out of a fetched HTML page.

For example, Nutch fetched a news page, but there is a lot of spam content
alongside the useful text. How can I filter it out?

Any reply will be appreciated!

