nutch-dev mailing list archives

From "Asitang Mishra (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (NUTCH-2126) Use selenium protocol for specific sites
Date Mon, 28 Sep 2015 18:15:04 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Asitang Mishra updated NUTCH-2126:
----------------------------------
    Summary: Use selenium protocol for specific sites  (was: Use selenium protocol for specific sites when switched on)

> Use selenium protocol for specific sites
> ----------------------------------------
>
>                 Key: NUTCH-2126
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2126
>             Project: Nutch
>          Issue Type: Sub-task
>          Components: fetcher
>            Reporter: Asitang Mishra
>
> Right now, if one uses the selenium or seleniuminteractive plugins, the fetcher uses
> them for all fetches. There will be situations where we don't want to incur the
> overhead of using selenium for every seed.
> We could provide some standardized key-value pairs that tell the protocol recognizer
> in Nutch that certain seeds should be fetched with the selenium plugin. Later on, we
> could append these key-value pairs to all outlinks, or only to outlinks in the same
> domain.
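
The proposal above could be sketched roughly as follows. This is only an illustration, not Nutch code: it assumes seeds carry tab-separated key=value metadata after the URL, and the key name `protocol.plugin`, the values `selenium` and `http`, and all function names are hypothetical, since the issue does not specify them. "Same domain" is approximated here by comparing host names.

```python
from urllib.parse import urlsplit

# Hypothetical metadata key and values; NUTCH-2126 does not name them.
PROTOCOL_KEY = "protocol.plugin"
DEFAULT_PROTOCOL = "http"


def parse_seed_line(line):
    """Split a seed line 'url<TAB>key=value<TAB>...' into (url, metadata)."""
    fields = line.rstrip("\n").split("\t")
    url = fields[0].strip()
    metadata = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if sep:  # skip fields that are not key=value pairs
            metadata[key.strip()] = value.strip()
    return url, metadata


def choose_protocol(metadata):
    """Return the protocol named in the seed metadata, else the default."""
    return metadata.get(PROTOCOL_KEY, DEFAULT_PROTOCOL)


def inherit_metadata(parent_url, parent_meta, outlinks, same_host_only=True):
    """Copy the parent's protocol marker to its outlinks.

    With same_host_only=True, only outlinks on the parent's host inherit it
    (a simplification of "same domain").
    """
    parent_host = urlsplit(parent_url).hostname
    result = {}
    for link in outlinks:
        meta = {}
        if PROTOCOL_KEY in parent_meta:
            if not same_host_only or urlsplit(link).hostname == parent_host:
                meta[PROTOCOL_KEY] = parent_meta[PROTOCOL_KEY]
        result[link] = meta
    return result
```

For example, a seed line `http://example.com/app<TAB>protocol.plugin=selenium` would be routed to selenium, a seed with no such pair would fall back to the default protocol, and only outlinks on `example.com` would inherit the marker when `same_host_only` is left on.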



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
