nutch-dev mailing list archives

From Roannel Fernández Hernández (JIRA) <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1480) SolrIndexer to write to multiple servers.
Date Wed, 23 Aug 2017 14:41:00 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138433#comment-16138433 ]

Roannel Fernández Hernández commented on NUTCH-1480:
----------------------------------------------------

I’m testing a solution that uses this file [1] to configure the index writers. In this XML
file, each "writer" tag can hold the parameters used by that writer and a mapping for every
field of the Nutch documents. With this approach, every index writer can have its own field
mappings, not only the Solr index writer. We will also be able to define several different
configurations, even for the same IndexWriter class. The solution applies to all types of
index writers, not just the Solr index writer.

The structure of [1] is described in [2].

[1] https://github.com/r0ann3l/nutch/blob/NUTCH-1480/conf/index-writers.xml.template
[2] https://github.com/r0ann3l/nutch/blob/NUTCH-1480/conf/index-writers.xsd
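
To make the idea concrete, here is a rough sketch of how a per-writer configuration with
its own parameters and field mapping could be read and applied. The element and attribute
names (writers, writer, param, mapping, field, source, dest), the class name and the URLs
are assumptions made up for this illustration; they are not copied from [1] or [2], and the
code is Python rather than Nutch's Java only to keep it short.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: every element and attribute name here is an assumption
# for illustration, not the actual structure of index-writers.xml.template.
SAMPLE = """
<writers>
  <writer id="solr_primary" class="org.example.SolrIndexWriter">
    <param name="url" value="http://solr-a.example.org:8983/solr/nutch"/>
    <mapping>
      <field source="title" dest="title_s"/>
    </mapping>
  </writer>
  <writer id="solr_backup" class="org.example.SolrIndexWriter">
    <param name="url" value="http://solr-b.example.org:8983/solr/nutch"/>
    <mapping>
      <field source="title" dest="headline"/>
    </mapping>
  </writer>
</writers>
"""


def load_writers(xml_text):
    """Return {writer id: {"class", "params", "mapping"}} from the XML text."""
    writers = {}
    for w in ET.fromstring(xml_text).findall("writer"):
        writers[w.get("id")] = {
            "class": w.get("class"),
            "params": {p.get("name"): p.get("value") for p in w.findall("param")},
            "mapping": {f.get("source"): f.get("dest")
                        for f in w.findall("mapping/field")},
        }
    return writers


def map_fields(doc, mapping):
    """Rename document fields per the mapping; unmapped fields keep their name."""
    return {mapping.get(name, name): value for name, value in doc.items()}


if __name__ == "__main__":
    doc = {"title": "Apache Nutch", "url": "https://nutch.apache.org/"}
    for writer_id, cfg in load_writers(SAMPLE).items():
        print(writer_id, cfg["params"]["url"], map_fields(doc, cfg["mapping"]))
```

Both sample writers use the same class but carry different parameters and mappings, which
is the case the comment above calls out.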

> SolrIndexer to write to multiple servers.
> -----------------------------------------
>
>                 Key: NUTCH-1480
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1480
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>         Attachments: adding-support-for-sharding-indexer-for-solr.patch, NUTCH-1480-1.6.1.patch
>
>
> SolrUtils should return an array of SolrServers and read the SolrUrl as a comma delimited
> list of URL's using Configuration.getString(). SolrWriter should be able to handle this list
> of SolrServers.
> This is useful if you want to send documents to multiple servers if no replication is
> available or if you want to send documents to multiple NOCs.
> edit:
> This does not replace NUTCH-1377 but complements it. With NUTCH-1377 this issue allows
> you to index to multiple SolrCloud clusters at the same time.
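
The behaviour requested in the issue description (read a comma-delimited list of URLs,
then send each document to every server) can be sketched roughly as follows. This is not
the SolrUtils or SolrWriter code from Nutch: parse_server_urls, FanOutWriter and the send
callable are hypothetical names introduced for this illustration, and no real Solr client
is involved.

```python
def parse_server_urls(raw):
    """Split a comma-delimited URL list, trimming whitespace and dropping blanks."""
    return [url.strip() for url in raw.split(",") if url.strip()]


class FanOutWriter:
    """Hypothetical writer that forwards every document to every configured server."""

    def __init__(self, urls, send):
        self.urls = list(urls)
        # `send` is a stand-in callable(url, doc); a real indexer would use a
        # Solr client here instead.
        self.send = send

    def write(self, doc):
        for url in self.urls:
            self.send(url, doc)


if __name__ == "__main__":
    urls = parse_server_urls("http://a.example.org/solr, http://b.example.org/solr")
    writer = FanOutWriter(urls, lambda url, doc: print("sending", doc, "to", url))
    writer.write({"id": "1", "title": "Apache Nutch"})
```

The sketch ignores error handling; a real writer would have to decide what to do when one
server of several rejects a document.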



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
