nifi-users mailing list archives

From ddewaele <>
Subject NiFi Cluster with lots of SUSPENDED, RECONNECTED, LOST events
Date Tue, 13 Jun 2017 21:35:33 GMT
We have a 2-node NiFi cluster running with 3 ZooKeeper instances (replicated)
in a Docker Swarm cluster.

Most of the time the cluster operates fine, but from time to time we notice
that NiFi stops processing messages completely. It eventually resumes after
a while (sometimes after a couple of seconds, sometimes after a couple of

When I run grep o.a.n.c.l.e.CuratorLeaderElectionManager
/srv/nifi/logs/nifi-app.log on the primary node, I see a lot of suspended /
reconnected messages.

Likewise on the other node, I see similar messages

The only real exceptions I'm seeing in the logs are these

I also see this in the UI from time to time:

com.sun.jersey.api.client.ClientHandlerException: Read timed out

Is there anything I can do to debug this further?
Is it normal to see that many connection state changes? (The logs are full
of them.)
The solution is running on 3 VMs, using Docker Swarm. NiFi is running on 2
of those 3 VMs. We have a ZooKeeper setup running on all 3 VMs.

I don't see any errors in the ZooKeeper logs.
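
To put a number on how many connection state changes the grep above turns up,
something like the following could tally them per minute. This is only a
minimal sketch under assumptions: it assumes each log line starts with a
"YYYY-MM-DD HH:MM:SS,mmm" timestamp and that lines mentioning
CuratorLeaderElectionManager contain the state name (SUSPENDED, RECONNECTED
or LOST). The function name count_state_changes is hypothetical and not part
of NiFi.

```python
import re
from collections import Counter

# Assumed line shape (not verified against a real nifi-app.log):
#   "<YYYY-MM-DD HH:MM:SS,mmm> ... CuratorLeaderElectionManager ... <STATE>"
# Group 1 keeps the timestamp truncated to the minute, group 2 the state name.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}"
    r".*CuratorLeaderElectionManager"
    r".*\b(SUSPENDED|RECONNECTED|LOST)\b"
)


def count_state_changes(lines):
    """Count (minute, state) pairs among the matching log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            minute, state = match.groups()
            counts[(minute, state)] += 1
    return counts
```

It could be fed the log directly, for example
count_state_changes(open("/srv/nifi/logs/nifi-app.log", errors="replace")),
and the resulting per-minute counts compared against the times at which
processing stalls.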
