nutch-dev mailing list archives

From "Sujen Shah (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity
Date Wed, 13 Apr 2016 17:52:25 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujen Shah updated NUTCH-2249:
------------------------------
    Fix Version/s: 1.12

> WordNet Integration for Cosine Similarity
> -----------------------------------------
>
>                 Key: NUTCH-2249
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2249
>             Project: Nutch
>          Issue Type: New Feature
>          Components: plugin, scoring
>            Reporter: Bhavya Sanghavi
>            Assignee: Sujen Shah
>            Priority: Minor
>              Labels: memex
>             Fix For: 1.12
>
>
> Integrates the WordNet database into the cosine similarity plugin.
> Mapping synonymous words to the same vector entry reduces the size of the
> vectors used to compute cosine similarity. This, in turn, should improve the
> accuracy of the scores assigned to the webpages to be crawled.
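
The idea in the description can be illustrated with a minimal sketch. This is not the plugin's code (Nutch itself is written in Java); it is a small Python illustration under stated assumptions. The SYNONYMS table is a hypothetical, hand-written stand-in for a WordNet lookup, and the names canonical, term_vector and cosine_similarity are made up for this example.

```python
import math
from collections import Counter

# Hypothetical synonym table standing in for a WordNet lookup (assumption:
# the real plugin would query WordNet instead of a hard-coded dict).
SYNONYMS = {
    "automobile": "car",
    "auto": "car",
    "quick": "fast",
    "rapid": "fast",
}


def canonical(term):
    """Map a term to its canonical synonym; unknown terms map to themselves."""
    return SYNONYMS.get(term, term)


def term_vector(text, use_synonyms=True):
    """Build a term-frequency vector, optionally collapsing synonyms."""
    tokens = text.lower().split()
    if use_synonyms:
        tokens = [canonical(t) for t in tokens]
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    page, goal = "fast car", "quick automobile"
    # Without synonym mapping the two texts share no vector entries.
    print(cosine_similarity(term_vector(page, False), term_vector(goal, False)))
    # With synonym mapping both texts collapse onto the same two entries.
    print(cosine_similarity(term_vector(page), term_vector(goal)))
```

In this toy case the combined vocabulary shrinks from four entries to two once synonyms are merged, and the two texts go from sharing no entries to sharing all of them, which is the effect the description attributes to the WordNet mapping.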



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
