nutch-dev mailing list archives

From "Francesco Capponi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-2275) MD5Signature by default doesn't take in account parse
Date Wed, 08 Jun 2016 11:04:20 GMT
Francesco Capponi created NUTCH-2275:
----------------------------------------

             Summary: MD5Signature by default doesn't take in account parse
                 Key: NUTCH-2275
                 URL: https://issues.apache.org/jira/browse/NUTCH-2275
             Project: Nutch
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.11
            Reporter: Francesco Capponi


I'm testing Apache Nutch with the feed plugin. I've noticed that it generates the same
digest/signature for every page, so dedup cleans all of them out of the database.

I'm wondering why the class MD5Signature is the default one instead of TextMD5Signature.

For now, I've slightly modified MD5Signature so that it works with the feed plugin.
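
To illustrate the symptom, here is a minimal, self-contained sketch. It does not use the Nutch classes: SignatureSketch, contentSignature and textSignature are hypothetical names, and the sketch assumes that the feed plugin produces several parse entries from one fetched document, so a signature computed only from the raw fetched bytes is identical for all of them, while a signature computed from each entry's parsed text differs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; these are not the Nutch signature classes.
public class SignatureSketch {

    // Hex-encoded MD5 digest of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Assumed "content only" behaviour: hash the raw fetched bytes, ignore the parse.
    static String contentSignature(byte[] rawContent) {
        return md5Hex(rawContent);
    }

    // Assumed "parse aware" behaviour: hash the text extracted for one entry.
    static String textSignature(String parseText) {
        return md5Hex(parseText.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // One fetched feed document that yields two entries (made-up data).
        byte[] feed = "<rss><item>First post</item><item>Second post</item></rss>"
                .getBytes(StandardCharsets.UTF_8);
        String[] entryTexts = {"First post", "Second post"};

        for (String text : entryTexts) {
            System.out.println("content-based: " + contentSignature(feed)
                    + "  text-based: " + textSignature(text));
        }
    }
}
```

With the content-based signature both entries print the same digest, which is the condition under which dedup treats them as duplicates. If I remember correctly, the signature implementation is selected through the db.signature.class property, so switching to TextMD5Signature in nutch-site.xml may be an alternative to patching MD5Signature; I have not verified this against 1.11.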



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
