nutch-dev mailing list archives

From Owen Lin <oal...@nyu.edu>
Subject Re: [jira] [Commented] (NUTCH-2039) Relevance based scoring filter
Date Mon, 15 Jun 2015 23:15:13 GMT
Please unsubscribe!!!
On Jun 15, 2015 4:14 PM, "Sujen Shah (JIRA)" <jira@apache.org> wrote:

>
>     [
> https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587113#comment-14587113
> ]
>
> Sujen Shah commented on NUTCH-2039:
> -----------------------------------
>
> Done, updated the PR.
>
> > Relevance based scoring filter
> > ------------------------------
> >
> >                 Key: NUTCH-2039
> >                 URL: https://issues.apache.org/jira/browse/NUTCH-2039
> >             Project: Nutch
> >          Issue Type: New Feature
> >            Reporter: Sujen Shah
> >              Labels: memex, nutch
> >             Fix For: 1.11
> >
> >
> > A ScoringFilter plugin that uses a similarity measure to compare a given
> > page (the gold standard) with the currently parsed page. The resulting
> > similarity score is then distributed to the parsed page's outlinks. The
> > filter aims to focus the crawl on relevant pages.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
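
The scoring idea in the quoted issue description can be illustrated with a rough sketch. This is not the NUTCH-2039 plugin and does not use the Nutch ScoringFilter API. It assumes cosine similarity over raw term counts as the similarity measure, and it assumes each outlink simply inherits the parent page's score; the issue text specifies neither. The names term_vector, cosine_similarity and score_outlinks are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count lowercase alphanumeric terms in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_outlinks(gold_text, page_text, outlinks):
    """Score a parsed page against the gold standard page.

    Assumption: every outlink receives the parent's similarity score
    unchanged (the score is not divided among the outlinks).
    """
    score = cosine_similarity(term_vector(gold_text), term_vector(page_text))
    return {url: score for url in outlinks}


if __name__ == "__main__":
    gold = "focused web crawler for relevant pages"
    page = "a crawler that explores relevant web pages"
    print(score_outlinks(gold, page, ["http://a.example/", "http://b.example/"]))
```

Under these assumptions, a crawler that fetches higher-scored URLs first would tend to follow links found on pages that resemble the gold standard page.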
