nifi-dev mailing list archives

From Matt Gilman <matt.c.gil...@gmail.com>
Subject Re: Resetting counters whilst clustered disconnects nodes
Date Fri, 04 Sep 2015 12:20:54 GMT
Tommy,

Thanks for the great write up! I've replicated the node disconnecting using the steps
you provided and created a JIRA for the issue [1]. As for the other concern, that is how
it's currently designed to work. The 'run on primary node only' setting applies only while
a node is part of a cluster. If a node is disconnected from the cluster and a processor is
configured with that scheduling strategy, the processor will run as though it's timer driven.
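
To make the rule above concrete, here is a minimal sketch in Python. It is not NiFi code: the enum, the function, and the parameter names are all hypothetical and only model the behaviour described in this message.

```python
from enum import Enum


class SchedulingStrategy(Enum):
    # Hypothetical labels; they only mirror the wording used in this thread.
    TIMER_DRIVEN = "timer driven"
    PRIMARY_NODE_ONLY = "on primary node"


def should_run(strategy, connected_to_cluster, is_primary):
    """Return True if a processor would be scheduled on this node.

    Assumption: the primary-node restriction is only honoured while the
    node is connected to the cluster, as described in the message above.
    """
    if strategy is SchedulingStrategy.PRIMARY_NODE_ONLY and connected_to_cluster:
        return is_primary
    # Timer driven, or disconnected from the cluster: the processor runs.
    return True


# A connected, non-primary node stays idle ...
print(should_run(SchedulingStrategy.PRIMARY_NODE_ONLY, True, False))
# ... but the same node starts running once it is disconnected.
print(should_run(SchedulingStrategy.PRIMARY_NODE_ONLY, False, False))
```

Under this model, the disconnected node emitting FlowFiles is a consequence of the disconnect rather than a second, independent bug.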

We should have the counters issue addressed for the upcoming 0.3.0 release.

Thanks!

Matt

[1] https://issues.apache.org/jira/browse/NIFI-926

> On Thu, Sep 3, 2015 at 5:14 AM, tommy.yardley@baesystems.com wrote:
> Hi,
> 
> I have a three-machine setup (1 NCM + 2 Nodes) running 0.2.0-incubating and observed
> the following:
> 
> 
> 1. Resetting counters can result in the NCM disconnecting a node
> 2. The node that is disconnected begins processing FlowFiles
> 
> Description:
> 
> My clustered NiFi is running a single pipeline containing 3 processors. While the pipeline
> is running, resetting counters causes any node that is not processing anything (i.e. is not
> contributing to the count) to disconnect. The node can then be reconnected via the UI.
> Looking at the stats, it appears the pipeline then began running on the disconnected node
> as well as on the single remaining connected node. This has been tested using custom
> processors as well as standard processors.
> 
> Steps to Replicate:
>
> 1. Create a cluster with 2 nodes + 1 NCM (2 processing nodes are needed or the
>    problem won't appear)
> 2. Add a GenerateFlowFile processor:
>    a. Scheduling: Change Scheduling strategy to 'On primary node'
>    b. Properties: Change File Size to '10B' (say)
> 3. Add a HashAttribute processor:
>    a. Properties: Change Key to 'hash.value'
> 4. Add a DetectDuplicate processor:
>    a. Properties: Under Distributed Cache Service add a 'DistributedMapCacheClientService'
>       i.   For the Client Service, set Server name to 'localhost' under properties
>       ii.  Enable the Client Service
>       iii. Add a DistributedMapCacheServer under the Controller Services
>       iv.  Enable the Cache Server
>       v.   Exit NiFi Flow Settings
> 5. Connect all 3 processors on success
> 6. Auto-terminate all options for DetectDuplicate
> 7. Run all processors and wait for ~10 seconds or so
> 8. Open the counters tab and refresh to make sure counters > 0
> 9. Reset one of the counters
> 
> Note: I'm specifically using the DetectDuplicate processor in this example because it
contains a custom counter.
> 
> This should then disconnect the node that was not active (node that was not selected
to be the primary). Even though the GenerateFlowFile processor is scheduled to run on the
primary node the disconnected node begins to emit FlowFiles.
> 
> The following warning was pulled from the NCM's logs:
> 
> 2015-09-02 10:40:16,750 WARN [NiFi Web Server-149] o.a.n.c.manager.impl.WebClusterManager
One or more nodes failed to process URI 'http://localhost:8082/nifi-api/controller/counters/2207ea22-0d4a-389d-b746-82e568c6228d'.
 Requesting each node to disconnect from cluster.
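
When checking which counter reset triggered a disconnect, the counter ID can be pulled out of a warning like the one above. The following is a rough sketch only: the helper name is hypothetical, the pattern is based solely on this one log line, and it assumes the warning has been joined back onto a single line and that the last path segment is a UUID-style ID.

```python
import re
import uuid
from urllib.parse import urlparse

# Matches the quoted URI in the warning; wording taken from the log line above.
FAILED_URI = re.compile(r"failed to process URI '([^']+)'")


def counter_id_from_warning(line):
    """Return the counter ID from a 'failed to process URI' warning, or None."""
    match = FAILED_URI.search(line)
    if match is None:
        return None
    path = urlparse(match.group(1)).path
    parent, _, last = path.rpartition("/")
    # Only accept URIs that point at a single counter.
    if not parent.endswith("/counters"):
        return None
    try:
        # Assumption: counter IDs are UUID-formatted, as in the example.
        return str(uuid.UUID(last))
    except ValueError:
        return None


warning = (
    "2015-09-02 10:40:16,750 WARN [NiFi Web Server-149] "
    "o.a.n.c.manager.impl.WebClusterManager One or more nodes failed to process URI "
    "'http://localhost:8082/nifi-api/controller/counters/"
    "2207ea22-0d4a-389d-b746-82e568c6228d'. "
    "Requesting each node to disconnect from cluster."
)
print(counter_id_from_warning(warning))
```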
> 
> I'm interested in knowing if this is expected behaviour or if I should open a JIRA ticket
(2 perhaps).
> 
> Thanks,
> Tommy

