nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-684) Dedup support for Solr
Date Mon, 09 Mar 2009 16:46:50 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680194#action_12680194 ]

Andrzej Bialecki  commented on NUTCH-684:
-----------------------------------------

Yes, I'm aware of this functionality. At this point, however, I thought that it would only complicate
things, because users would have to install Nutch classes on Solr in order to use the Signature
implementations that we use. This is of course an open issue that we should investigate after
the 1.0 release.

> Dedup support for Solr
> ----------------------
>
>                 Key: NUTCH-684
>                 URL: https://issues.apache.org/jira/browse/NUTCH-684
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>         Attachments: NUTCH-684_bin_nutch.patch, NUTCH-684_solrdedup_v2.patch, solrdedup.patch, solrdedup_v2.patch
>
>
> After NUTCH-442, Nutch can now index to both Solr and Lucene. However, the duplicate deletion
> feature (based on digests) is only available for Lucene. It should also be available for Solr.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

