nifi-users mailing list archives

From Mike Thomsen <mikerthom...@gmail.com>
Subject Re: NiFi fails on cluster nodes
Date Fri, 12 Oct 2018 12:29:45 GMT
Alexander,

I am pretty sure your problem is here:
*nifi.state.management.embedded.zookeeper.start=true*

That spins up an embedded ZooKeeper, which is generally intended to be used
for local development. For example, HBase provides the same feature, but it
is intended to allow you to test a real HBase client application against a
single node of HBase running locally.

Try these steps:

1. Set up an external ZooKeeper instance (or a quorum of 3; an odd number
of servers is recommended, since an even-sized ensemble tolerates no more
failures than the next smaller odd-sized one)
2. Update nifi.properties on each node to use the external ZooKeeper setup.
3. Restart all of them.
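
For step 2, the relevant nifi.properties entries might look like the sketch
below. The hostnames zk1, zk2 and zk3 are placeholders and do not come from
this thread; 2181 is the ZooKeeper client port already mentioned in the
quoted message.

```properties
# Sketch only: zk1/zk2/zk3 are placeholder hostnames, not from this thread.
# Do not start the embedded ZooKeeper on the NiFi nodes.
nifi.state.management.embedded.zookeeper.start=false
# Point every NiFi node at the external ensemble (2181 is the ZooKeeper client port).
nifi.zookeeper.connect.string=zk1:2181,zk2:2181,zk3:2181
```

The cluster state provider in conf/state-management.xml has its own Connect
String property, which would normally need the same value.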

See if that works.

Mike

On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
alexander.saip@nih.gov> wrote:

> *nifi.cluster.node.protocol.port=11443* by default on all nodes, I
> haven’t touched that property. Yesterday, we discovered some issues
> preventing two of the boxes from communicating. Now, they can talk okay.
> Ports 11443, 2181 and 3888 are explicitly open in *iptables*, but
> clustering still doesn’t happen. The log files are filled up with errors
> like this:
>
>
>
> 2018-10-12 07:59:08,494 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>
>
>
> Is there anything else we should check?
>
>
>
> *From:* Nathan Gough <thenatog@gmail.com>
> *Sent:* Thursday, October 11, 2018 9:12 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: NiFi fails on cluster nodes
>
>
>
> You may also need to explicitly open ‘nifi.cluster.node.protocol.port’ on
> all nodes to allow cluster communication for cluster heartbeats etc.
>
>
>
> *From: *ashmeet kandhari <ashmeetkandhari93@gmail.com>
> *Reply-To: *<users@nifi.apache.org>
> *Date: *Thursday, October 11, 2018 at 9:09 AM
> *To: *<users@nifi.apache.org>
> *Subject: *Re: NiFi fails on cluster nodes
>
>
>
> Hi Alexander,
>
>
>
> Can you verify that the 3 nodes can reach one another, for example with a
> TCP ping? Or run NiFi in standalone mode and check that you can ping each
> node from the other 2 servers, just to be sure they can communicate with
> one another.
>
>
>
> On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.saip@nih.gov> wrote:
>
> How do I do that? The *nifi.properties* file on each node includes
> ‘*nifi.state.management.embedded.zookeeper.start=true*’, so I assume
> ZooKeeper does start.
>
>
>
> *From:* ashmeet kandhari <ashmeetkandhari93@gmail.com>
> *Sent:* Thursday, October 11, 2018 4:36 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: NiFi fails on cluster nodes
>
>
>
> Can you check that the ZooKeeper node is up and running and can connect to
> the NiFi nodes?
>
>
>
> On Wed, Oct 10, 2018 at 7:34 PM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.saip@nih.gov> wrote:
>
> Hello,
>
>
>
> We have three NiFi 1.7.1 nodes originally configured as independent
> instances, each on its own server. There is no firewall between them. When
> I tried to build a cluster following instructions here
> <https://mintopsblog.com/2017/11/12/apache-nifi-cluster-configuration/>,
> NiFi failed to start on all of them, despite the fact that I even set
> *nifi.cluster.protocol.is.secure=false* in the *nifi.properties* file on
> each node. Here is the error in the log files:
>
>
>
> 2018-10-10 13:57:07,506 INFO [main] org.apache.nifi.NiFi Launching NiFi...
> 2018-10-10 13:57:07,745 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties path to be '/opt/nifi-1.7.1/./conf/nifi.properties'
> 2018-10-10 13:57:07,748 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Loaded 125 properties from /opt/nifi-1.7.1/./conf/nifi.properties
> 2018-10-10 13:57:07,755 INFO [main] org.apache.nifi.NiFi Loaded 125 properties
> 2018-10-10 13:57:07,762 INFO [main] org.apache.nifi.BootstrapListener Started Bootstrap Listener, Listening for incoming requests on port 43744
> 2018-10-10 13:59:15,056 ERROR [main] org.apache.nifi.NiFi Failure to launch NiFi due to java.net.ConnectException: Connection timed out (Connection timed out)
> java.net.ConnectException: Connection timed out (Connection timed out)
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at java.net.Socket.connect(Socket.java:538)
>         at org.apache.nifi.BootstrapListener.sendCommand(BootstrapListener.java:100)
>         at org.apache.nifi.BootstrapListener.start(BootstrapListener.java:83)
>         at org.apache.nifi.NiFi.<init>(NiFi.java:102)
>         at org.apache.nifi.NiFi.<init>(NiFi.java:71)
>         at org.apache.nifi.NiFi.main(NiFi.java:292)
> 2018-10-10 13:59:15,058 INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web server...
> 2018-10-10 13:59:15,059 INFO [Thread-1] org.apache.nifi.NiFi Jetty web server shutdown completed (nicely or otherwise).
>
>
>
> Without clustering, the instances had no problem starting. Since this is
> our first experiment building a cluster, I’m not sure where to look for
> clues.
>
>
>
> Thanks in advance,
>
>
>
> Alexander
>
>
