nutch-dev mailing list archives

From "Bhavya Sanghavi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-2249) WordNet Integration for Cosine Similarity
Date Tue, 12 Apr 2016 22:16:25 GMT
Bhavya Sanghavi created NUTCH-2249:
--------------------------------------

             Summary: WordNet Integration for Cosine Similarity
                 Key: NUTCH-2249
                 URL: https://issues.apache.org/jira/browse/NUTCH-2249
             Project: Nutch
          Issue Type: New Feature
          Components: plugin, scoring
            Reporter: Bhavya Sanghavi
            Priority: Minor


This feature integrates the WordNet database into the cosine similarity plugin.
Synonymous words are mapped to the same entry in the term vector, which reduces the size
of the vectors used to calculate the cosine similarity. Consequently, it should increase the
accuracy of the scores given to the webpages to be crawled.
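
The idea can be illustrated with a minimal sketch. This is not the plugin's code: the class name, the method names, and the small hard-coded synonym table are all hypothetical, and the table only stands in for a real WordNet lookup. It shows how collapsing synonyms into one vector entry shrinks the vector and lets two texts that use different words for the same concept score as similar.

```java
import java.util.HashMap;
import java.util.Map;

public class SynonymCosineSketch {

    // Hypothetical stand-in for a WordNet lookup: word -> canonical synonym.
    static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        CANONICAL.put("automobile", "car");
        CANONICAL.put("auto", "car");
        CANONICAL.put("quick", "fast");
    }

    // Builds a term-frequency vector; optionally maps each token to its canonical synonym.
    static Map<String, Integer> termVector(String text, boolean mapSynonyms) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            String key = mapSynonyms ? CANONICAL.getOrDefault(token, token) : token;
            vector.merge(key, 1, Integer::sum);
        }
        return vector;
    }

    // Cosine similarity of two sparse term-frequency vectors; 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        String page = "quick automobile";
        String goal = "fast car";
        // Without synonym mapping the two texts share no terms.
        System.out.println(cosine(termVector(page, false), termVector(goal, false)));
        // With synonym mapping both texts reduce to the same two entries.
        System.out.println(cosine(termVector(page, true), termVector(goal, true)));
        // Three surface forms collapse into a single vector entry.
        System.out.println(termVector("auto automobile car", true).size());
    }
}
```

In this sketch the unmapped texts share no terms and score zero, while the mapped texts end up with identical vectors. A real integration would replace the hard-coded table with WordNet synset lookups and would need to decide how to handle words that belong to several synsets.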



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
