lucene-solr-user mailing list archives

From Pratik Thaker <Pratik.Tha...@smartstreamrdu.com>
Subject RE: DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain
Date Mon, 24 Apr 2017 12:27:51 GMT
Hi Alessandro,

Can you please suggest the correct order in which to add the processors?

I have 5 collections, 6 shards, a replication factor of 2, and 3 nodes on 3 separate VMs.

Regards,
Pratik Thaker

-----Original Message-----
From: alessandro.benedetti [mailto:a.benedetti@sease.io]
Sent: 21 April 2017 13:38
To: solr-user@lucene.apache.org
Subject: RE: DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain

Let's make a quick distinction between PRE and POST processors in a SolrCloud architecture:

 "In a single node, stand-alone Solr, each update is run through all the update processors
in a chain exactly once. But the behavior of update request processors in SolrCloud deserves
special consideration. " cit. wiki

*PRE PROCESSORS*
All the processors defined BEFORE the distributedUpdateProcessor run ONLY on the first
node that receives the update (regardless of whether it is a leader or a replica).

*POST PROCESSORS*
The distributedUpdateProcessor forwards the update request to the correct leader (or to
multiple leaders if the request involves more than one shard); each leader then forwards
it to its replicas.
At this point the leaders and replicas execute all the update request processors defined
AFTER the distributedUpdateProcessor.
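
As a rough illustration of the PRE/POST split described above, a chain in solrconfig.xml
could be laid out as in the sketch below. The chain name and the choice of pre-processors
are assumptions made for this example; it is not the chain from the original question.

```xml
<!-- Illustrative sketch only: chain name and processor selection are assumed. -->
<updateRequestProcessorChain name="example-chain">
  <!-- PRE processors: run once, on the node that first receives the update -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>

  <!-- Forwards the update to the shard leader(s), which forward to replicas -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>

  <!-- POST processors: run on the leader and on every replica -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Anything expensive placed above the DistributedUpdateProcessorFactory line runs once per
update, while anything below it is repeated on the leader and on each replica.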

" Pre-processors and Atomic Updates
Because DistributedUpdateProcessor is responsible for processing Atomic Updates into full
documents on the leader node, this means that pre-processors which are executed only on the
forwarding nodes can only operate on the partial document. If you have a processor which must
process a full document then the only choice is to specify it as a post-processor."
wiki
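
To make the quoted point concrete, here is a minimal sketch of where a processor that needs
the complete document would go. The class com.example.FullDocumentProcessorFactory and the
chain name are hypothetical, invented only for this illustration.

```xml
<!-- Illustrative sketch only: com.example.FullDocumentProcessorFactory is hypothetical. -->
<updateRequestProcessorChain name="full-doc-chain">
  <!-- Atomic updates are expanded into full documents here, on the leader -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>

  <!-- Placed AFTER distribution so that it sees the full document,
       not just the partial (atomic) update -->
  <processor class="com.example.FullDocumentProcessorFactory"/>

  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The trade-off is that such a post-processor runs on the leader and on every replica, so it
should be kept as light as possible.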

In your example the chain is definitely in the wrong order. The order matters: you want your
heavy processing to happen only on the first node, so it belongs before the
distributedUpdateProcessor.

For better info and clarification:
https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode (you can find a working
alternative to your chain here)
https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io