nutch-dev mailing list archives

From "Handl, Jorge" <>
Subject linksByMD5
Date Mon, 05 Sep 2005 19:51:28 GMT

I'm writing a webdb purger, and I'm having trouble writing the links of the
pages that haven't been purged to the new db.

The docs seem to imply that adding a link having a source page that is not
present in the webdb should fail, but apparently it doesn't.

So I try to filter out the links that shouldn't be inserted, but I can't
access the links by MD5, even though I find both linksByURL and linksByMD5
directories in the webdb... Why is that so?
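
The filtering step described above can be sketched roughly as follows. This is
only an illustration in plain Java and does not use the Nutch classes: the
Link type, its fromMd5 and toUrl fields, and keepLinksWithSurvivingSource are
hypothetical stand-ins, and it assumes the MD5s of the surviving pages have
already been collected into a set while copying pages to the new db.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LinkFilterSketch {
    // Hypothetical stand-in for a webdb link (NOT the Nutch Link class):
    // the MD5 of the source page as a hex string, plus the target URL.
    static final class Link {
        final String fromMd5;
        final String toUrl;

        Link(String fromMd5, String toUrl) {
            this.fromMd5 = fromMd5;
            this.toUrl = toUrl;
        }
    }

    // Keep only the links whose source page survived the purge.
    static List<Link> keepLinksWithSurvivingSource(List<Link> links,
                                                   Set<String> survivingMd5s) {
        List<Link> kept = new ArrayList<>();
        for (Link link : links) {
            if (survivingMd5s.contains(link.fromMd5)) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up example data: only the page with MD5 "aa11" was kept.
        Set<String> surviving = new HashSet<>(Arrays.asList("aa11"));
        List<Link> links = Arrays.asList(
                new Link("aa11", "http://example.org/a"),
                new Link("bb22", "http://example.org/b"));

        for (Link link : keepLinksWithSurvivingSource(links, surviving)) {
            System.out.println(link.fromMd5 + " -> " + link.toUrl);
        }
    }
}
```

With a set of surviving MD5s, the check costs one lookup per link, so it would
not depend on being able to read links by MD5 from the old db.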

