nutch-dev mailing list archives

From lin weijian <linweiji...@gmail.com>
Subject DbUpdateReducer could not mark its batchid
Date Wed, 15 Aug 2012 11:59:08 GMT
    Hi,
    I found a bug in Nutch 2.0 that prevents Mark.UPDATEDB_MARK from recording its batch ID.

    It is in the reduce function of org.apache.nutch.crawl.DbUpdateReducer:

    Mark.GENERATE_MARK.removeMarkIfExist(page);
    Mark.FETCH_MARK.removeMarkIfExist(page);
    Utf8 mark = Mark.PARSE_MARK.removeMarkIfExist(page);
    if (mark != null) {
      Mark.UPDATEDB_MARK.putMark(page, mark);
    }

    This code is meant to clear the generate, fetch and parse batch IDs and
    then set the updatedb batch ID. However, Mark.UPDATEDB_MARK.putMark(page, mark)
    never executes, because mark is always null.

    The cause is in Gora 0.2: the remove function of StatefulHashMap, which
    WebPage's markers call, always returns null.
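
    To illustrate the failure mode and one possible workaround, here is a
    minimal, self-contained sketch. It does not use the Nutch or Gora classes:
    NullReturningMap is a hypothetical stand-in for a map whose remove() always
    returns null, and the key names and batch ID are made up for the example.
    The idea is to read the mark with get() before removing it, instead of
    relying on the return value of remove().

```java
import java.util.HashMap;
import java.util.Map;

class MarkerWorkaroundSketch {

    /** Hypothetical stand-in for a map whose remove() always returns null. */
    static class NullReturningMap<K, V> extends HashMap<K, V> {
        @Override
        public V remove(Object key) {
            super.remove(key); // the entry is removed...
            return null;       // ...but the old value is never reported
        }
    }

    /** Mirrors the pattern in the message: relies on the return value of remove(). */
    static String moveMarkUsingRemove(Map<String, String> markers, String from, String to) {
        String mark = markers.remove(from);
        if (mark != null) {
            markers.put(to, mark);
        }
        return markers.get(to);
    }

    /** Possible workaround: read the value with get() before removing it. */
    static String moveMarkUsingGet(Map<String, String> markers, String from, String to) {
        String mark = markers.get(from);
        markers.remove(from);
        if (mark != null) {
            markers.put(to, mark);
        }
        return markers.get(to);
    }

    public static void main(String[] args) {
        // Key names and batch ID are illustrative only.
        Map<String, String> a = new NullReturningMap<>();
        a.put("parse_mark", "batch-1");
        System.out.println(moveMarkUsingRemove(a, "parse_mark", "updatedb_mark")); // prints "null"

        Map<String, String> b = new NullReturningMap<>();
        b.put("parse_mark", "batch-1");
        System.out.println(moveMarkUsingGet(b, "parse_mark", "updatedb_mark")); // prints "batch-1"
    }
}
```

    Whether the better fix belongs in Nutch (read before remove) or in Gora
    (make remove return the previous value) is a separate question; the sketch
    only shows that the first option does not depend on what remove() returns.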


    Thanks.