nutch-dev mailing list archives

From "Lewis John McGibbney (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-945) Indexing to multiple SOLR Servers
Date Sun, 04 Mar 2012 14:13:59 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221898#comment-13221898
] 

Lewis John McGibbney commented on NUTCH-945:
--------------------------------------------

On user@, Julien made some excellent comments on this one [0]. I would like to see these
incorporated, although admittedly I've not checked out the patch, Sujit (so please excuse me
if these points are already addressed). My justification for this is simply longevity.
Markus stated [1]:

{bq}"If Solr 4.0 is released in the coming months (and that's what it looks like) i 
would suggest to patch Nutch to allow for a list of Solr server URL's instead 
of doing partitioning on the client site."
{bq}

I agree with this; however, until we see a Solr 4.0 release (currently sitting at 348 unresolved
issues [2]), I don't see why this can't be integrated into Nutchgora.


[0] http://www.mail-archive.com/user@nutch.apache.org/msg05664.html
[1] http://www.mail-archive.com/user@nutch.apache.org/msg05674.html
[2] https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SOLR+AND+resolution+%3D+Unresolved+AND+fixVersion+%3D+%224.0%22+ORDER+BY+priority+DESC&mode=hide
                
> Indexing to multiple SOLR Servers
> ---------------------------------
>
>                 Key: NUTCH-945
>                 URL: https://issues.apache.org/jira/browse/NUTCH-945
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.2
>            Reporter: Charan Malemarpuram
>         Attachments: MurmurHashPartitioner.java, NonPartitioningPartitioner.java, patch-NUTCH-945.txt
>
>
> It would be nice to have a default Indexer in Nutch which can submit docs to multiple SOLR servers.
> Partitioning is always the question when writing to multiple SOLR servers.
> Default partitioning can be a simple hashcode-based distribution, with additional hooks for customization.
>  
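
For illustration, the hashcode-based default with a customization hook described in the summary above could look roughly like the sketch below. This is not the attached patch: the names DocPartitioner, HashCodePartitioner and SolrServerSelector are hypothetical, and the sketch uses only the JDK rather than any Nutch or SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook: maps a document key to an index in [0, numServers). */
interface DocPartitioner {
    int partition(String docKey, int numServers);
}

/** Assumed default strategy: distribute documents by String.hashCode(). */
final class HashCodePartitioner implements DocPartitioner {
    @Override
    public int partition(String docKey, int numServers) {
        if (numServers <= 0) {
            throw new IllegalArgumentException("numServers must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(docKey.hashCode(), numServers);
    }
}

/** Picks the target server URL for a document from a configured list. */
final class SolrServerSelector {
    private final List<String> serverUrls;
    private final DocPartitioner partitioner;

    SolrServerSelector(List<String> serverUrls, DocPartitioner partitioner) {
        if (serverUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one server URL is required");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.serverUrls = new ArrayList<>(serverUrls);
        this.partitioner = partitioner;
    }

    String serverFor(String docKey) {
        int index = partitioner.partition(docKey, serverUrls.size());
        return serverUrls.get(index);
    }
}
```

A custom strategy would be supplied by passing a different DocPartitioner to SolrServerSelector; the same document key always maps to the same server as long as the server list is unchanged.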

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
