nutch-dev mailing list archives

From "Semyon Semyonov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-2504) Results of maxCountExpr and fetchDelayExpr should be stored in memory in Generate
Date Thu, 25 Jan 2018 14:55:00 GMT
Semyon Semyonov created NUTCH-2504:
--------------------------------------

             Summary: Results of maxCountExpr and fetchDelayExpr should be stored in memory in Generate
                 Key: NUTCH-2504
                 URL: https://issues.apache.org/jira/browse/NUTCH-2504
             Project: Nutch
          Issue Type: Sub-task
          Components: generator
            Reporter: Semyon Semyonov


Since NUTCH-2455, the expressions maxCountExpr and fetchDelayExpr are evaluated for each value, which slows generation.
Instead, we can evaluate them once per host and store the results in hostDomainCounts.

That will take only 2 x sizeof(long) extra memory per host.
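
A minimal sketch of the idea, assuming the two expressions can be treated as functions from a host name to a long. The class HostLimitCache, its method names, and the use of ToLongFunction are hypothetical and are not taken from Nutch's Generator; the actual layout of hostDomainCounts may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical illustration only; not Nutch's actual Generator code.
class HostLimitCache {
    // Two longs per host, matching the "2 x sizeof(long)" estimate:
    // index 0 holds the maxCountExpr result, index 1 the fetchDelayExpr result.
    private final Map<String, long[]> cache = new HashMap<>();
    private final ToLongFunction<String> maxCountExpr;   // stand-in for the real expression
    private final ToLongFunction<String> fetchDelayExpr; // stand-in for the real expression
    private int evaluations = 0;

    HostLimitCache(ToLongFunction<String> maxCountExpr,
                   ToLongFunction<String> fetchDelayExpr) {
        this.maxCountExpr = maxCountExpr;
        this.fetchDelayExpr = fetchDelayExpr;
    }

    // Returns the cached pair for the host, evaluating both expressions
    // only the first time the host is seen.
    long[] limitsFor(String host) {
        long[] limits = cache.get(host);
        if (limits == null) {
            evaluations++;
            limits = new long[] {
                maxCountExpr.applyAsLong(host),
                fetchDelayExpr.applyAsLong(host)
            };
            cache.put(host, limits);
        }
        return limits;
    }

    // Number of hosts for which the expressions were actually evaluated.
    int evaluations() {
        return evaluations;
    }
}
```

With this shape, repeated lookups for the same host return the stored pair without re-evaluating either expression, so the evaluation cost scales with the number of hosts rather than the number of values processed.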



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
