nifi-users mailing list archives

From Neil Derraugh <neil.derra...@intellifylearning.com>
Subject Re: Phantom node
Date Tue, 09 May 2017 20:18:51 GMT
Hmm... This problem has happened again.  I've got two phantom nodes,
6c538699-985b-47c3-9e91-14c54a225167 and
2d6619c6-afbd-46f4-b612-3f79d5daa714.  Those are two of the three previously
healthy nodes from a recent deployment.  The third node that was part of that
deployment (there are always 3) is e0ba6076-e561-4899-9529-3e0039c7a6b9.  I
suspect this may be related to these nodes having been shut down uncleanly.
Does that seem likely?
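
For context, this is the same GET on /nifi-api/controller/cluster as before;
roughly the following, where the host and API port are placeholders for any
reachable node rather than literal values:

# fetch the cluster summary from any connected node (placeholder host/port)
curl -s http://<nifi-host>:<api-port>/nifi-api/controller/cluster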

{
  "cluster": {
    "nodes": [
      {
        "nodeId": "6c538699-985b-47c3-9e91-14c54a225167",
        "address": "x.y.a.67",
        "apiPort": 31896,
        "status": "CONNECTED",
        "roles": [],
        "events": []
      },
      {
        "nodeId": "2d6619c6-afbd-46f4-b612-3f79d5daa714",
        "address": "x.y.c.212",
        "apiPort": 31399,
        "status": "CONNECTED",
        "roles": [],
        "events": []
      },
      {
        "nodeId": "b8a208eb-7de7-4786-8955-2f690b620371",
        "address": "x.y.b.202",
        "apiPort": 31053,
        "status": "DISCONNECTED",
        "roles": [],
        "events": [
          {
            "timestamp": "05/09/2017 17:46:25 GMT",
            "category": "INFO",
            "message": "Node Status changed from CONNECTED to DISCONNECTED
due to Have not received a heartbeat from node in 171 seconds"
          }
        ]
      },
      {
        "nodeId": "1c9d120b-0109-4e64-aef5-f13c31761621",
        "address": "x.y.c.45",
        "apiPort": 31049,
        "status": "CONNECTED",
        "heartbeat": "05/09/2017 20:07:00 GMT",
        "roles": [],
        "activeThreadCount": 0,
        "queued": "0 / 0 bytes",
        "events": [
          {
            "timestamp": "05/09/2017 17:55:06 GMT",
            "category": "INFO",
            "message": "Received first heartbeat from connecting node. Node
connected."
          },
          {
            "timestamp": "05/09/2017 17:52:56 GMT",
            "category": "INFO",
            "message": "Connection requested from existing node. Setting
status to connecting."
          },
          {
            "timestamp": "05/09/2017 17:52:46 GMT",
            "category": "INFO",
            "message": "Connection requested from existing node. Setting
status to connecting."
          }
        ],
        "nodeStartTime": "05/09/2017 17:52:37 GMT"
      },
      {
        "nodeId": "6d8b4ecd-70c9-4cf3-8fe0-87e618933385",
        "address": "x.y.b.80",
        "apiPort": 31308,
        "status": "CONNECTED",
        "roles": [],
        "events": []
      },
      {
        "nodeId": "fcf40c91-3093-436b-88da-23620df802f1",
        "address": "x.y.a.44",
        "apiPort": 31050,
        "status": "CONNECTED",
        "heartbeat": "05/09/2017 20:07:00 GMT",
        "roles": [],
        "activeThreadCount": 0,
        "queued": "0 / 0 bytes",
        "events": [
          {
            "timestamp": "05/09/2017 17:48:37 GMT",
            "category": "INFO",
            "message": "Node Status changed from CONNECTING to CONNECTED"
          },
          {
            "timestamp": "05/09/2017 17:47:09 GMT",
            "category": "INFO",
            "message": "Node Status changed from [Unknown Node] to
CONNECTING"
          }
        ],
        "nodeStartTime": "05/09/2017 17:47:07 GMT"
      },
      {
        "nodeId": "6fe93b13-0152-4a46-85de-0c7fb7de547a",
        "address": "x.y.b.4",
        "apiPort": 31639,
        "status": "CONNECTED",
        "heartbeat": "05/09/2017 20:07:00 GMT",
        "roles": [
          "Primary Node",
          "Cluster Coordinator"
        ],
        "activeThreadCount": 0,
        "queued": "0 / 0 bytes",
        "events": [
          {
            "timestamp": "05/09/2017 17:44:17 GMT",
            "category": "INFO",
            "message": "Node Status changed from CONNECTING to CONNECTED"
          },
          {
            "timestamp": "05/09/2017 17:44:05 GMT",
            "category": "INFO",
            "message": "Node Status changed from [Unknown Node] to
CONNECTING"
          }
        ],
        "nodeStartTime": "05/09/2017 17:44:03 GMT"
      },
      {
        "nodeId": "c8d1373b-252c-4253-ab0b-4a396885f560",
        "address": "x.y.a.204",
        "apiPort": 31414,
        "status": "DISCONNECTED",
        "roles": [],
        "events": []
      },
      {
        "nodeId": "e0ba6076-e561-4899-9529-3e0039c7a6b9",
        "address": "x.y.c.132",
        "apiPort": 31718,
        "status": "DISCONNECTED",
        "roles": [],
        "events": [
          {
            "timestamp": "05/09/2017 17:50:44 GMT",
            "category": "INFO",
            "message": "Node Status changed from CONNECTED to DISCONNECTED
due to Have not received a heartbeat from node in 257 seconds"
          }
        ]
      }
    ],
    "generated": "20:07:04 GMT"
  }
}
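
Assuming it's the same situation as before, I'll probably just remove the
stale entries again.  I believe the REST call for that is roughly the
following (same placeholder host/port, node ID taken from the listing above);
if an entry still shows CONNECTED it may need to be disconnected in the UI
first:

# remove a phantom entry by node id (placeholder host/port; sketch, not verified)
curl -s -X DELETE \
  http://<nifi-host>:<api-port>/nifi-api/controller/cluster/nodes/6c538699-985b-47c3-9e91-14c54a225167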

On Sat, May 6, 2017 at 9:38 PM, Neil Derraugh <
neil.derraugh@intellifylearning.com> wrote:

> No I just meant the original phantom node that seems to be in the cluster
> even though I think the VM's been decommissioned.
>
> On Fri, May 5, 2017 at 1:14 AM, Jeff <jtswork@gmail.com> wrote:
>
>> Neil,
>>
>> Quick question... by "The nodes that are healthy are all able to ping
>> each other", is one of the nodes not able to ping the other nodes?
>>
>> On Thu, May 4, 2017 at 11:23 AM Neil Derraugh <
>> neil.derraugh@intellifylearning.com> wrote:
>>
>>> Our three ZooKeepers are external.  The nodes that are healthy are all
>>> able to ping each other.  nifi.properties follows (with minor mods around
>>> hostnames/IPs).
>>>
>>> # Core Properties #
>>> nifi.version=1.1.2
>>> nifi.flow.configuration.file=/mnt/mesos/sandbox/conf/flow.xml.gz
>>> nifi.flow.configuration.archive.enabled=true
>>> nifi.flow.configuration.archive.dir=/mnt/mesos/sandbox/conf/archive/
>>> nifi.flow.configuration.archive.max.time=30 days
>>> nifi.flow.configuration.archive.max.storage=500 MB
>>> nifi.flowcontroller.autoResumeState=true
>>> nifi.flowcontroller.graceful.shutdown.period=10 sec
>>> nifi.flowservice.writedelay.interval=500 ms
>>> nifi.administrative.yield.duration=30 sec
>>> # If a component has no work to do (is "bored"), how long should we wait
>>> before checking again for work?
>>> nifi.bored.yield.duration=10 millis
>>> nifi.authorizer.configuration.file=/mnt/mesos/sandbox/conf/authorizers.xml
>>> nifi.login.identity.provider.configuration.file=/mnt/mesos/sandbox/conf/login-identity-providers.xml
>>> nifi.templates.directory=/mnt/mesos/sandbox/conf/templates
>>> nifi.ui.banner.text=master - dev3
>>> nifi.ui.autorefresh.interval=30 sec
>>> nifi.nar.library.directory=/opt/nifi/lib
>>> nifi.nar.library.directory.custom=/mnt/mesos/sandbox/lib
>>> nifi.nar.working.directory=/mnt/mesos/sandbox/work/nar/
>>> nifi.documentation.working.directory=/mnt/mesos/sandbox/work/docs/components
>>> ####################
>>> # State Management #
>>> ####################
>>> nifi.state.management.configuration.file=/mnt/mesos/sandbox/conf/state-management.xml
>>> # The ID of the local state provider
>>> nifi.state.management.provider.local=local-provider
>>> # The ID of the cluster-wide state provider. This will be ignored if
>>> NiFi is not clustered but must be populated if running in a cluster.
>>> nifi.state.management.provider.cluster=zk-provider
>>> # Specifies whether or not this instance of NiFi should run an embedded
>>> ZooKeeper server
>>> nifi.state.management.embedded.zookeeper.start=false
>>> # Properties file that provides the ZooKeeper properties to use if
>>> <nifi.state.management.embedded.zookeeper.start> is set to true
>>> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>>> # H2 Settings
>>> nifi.database.directory=/mnt/mesos/sandbox/data/database_repository
>>> nifi.h2.url.append=;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE
>>> # FlowFile Repository
>>> nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
>>> nifi.flowfile.repository.directory=/mnt/mesos/sandbox/data/flowfile_repository
>>> nifi.flowfile.repository.partitions=256
>>> nifi.flowfile.repository.checkpoint.interval=2 mins
>>> nifi.flowfile.repository.always.sync=false
>>> nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
>>> nifi.queue.swap.threshold=20000
>>> nifi.swap.in.period=5 sec
>>> nifi.swap.in.threads=1
>>> nifi.swap.out.period=5 sec
>>> nifi.swap.out.threads=4
>>> # Content Repository
>>> nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
>>> nifi.content.claim.max.appendable.size=10 MB
>>> nifi.content.claim.max.flow.files=100
>>> nifi.content.repository.directory.default=/mnt/mesos/sandbox/data/content_repository
>>> nifi.content.repository.archive.max.retention.period=12 hours
>>> nifi.content.repository.archive.max.usage.percentage=50%
>>> nifi.content.repository.archive.enabled=true
>>> nifi.content.repository.always.sync=false
>>> nifi.content.viewer.url=/nifi-content-viewer/
>>> # Provenance Repository Properties
>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
>>> # Persistent Provenance Repository Properties
>>> nifi.provenance.repository.directory.default=/mnt/mesos/sandbox/data/provenance_repository
>>> nifi.provenance.repository.max.storage.time=24 hours
>>> nifi.provenance.repository.max.storage.size=1 GB
>>> nifi.provenance.repository.rollover.time=30 secs
>>> nifi.provenance.repository.rollover.size=100 MB
>>> nifi.provenance.repository.query.threads=2
>>> nifi.provenance.repository.index.threads=1
>>> nifi.provenance.repository.compress.on.rollover=true
>>> nifi.provenance.repository.always.sync=false
>>> nifi.provenance.repository.journal.count=16
>>> # Comma-separated list of fields. Fields that are not indexed will not
>>> be searchable. Valid fields are:
>>> # EventType, FlowFileUUID, Filename, TransitURI, ProcessorID,
>>> AlternateIdentifierURI, Relationship, Details
>>> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
>>> # FlowFile Attributes that should be indexed and made searchable.  Some
>>> examples to consider are filename, uuid, mime.type
>>> nifi.provenance.repository.indexed.attributes=
>>> # Large values for the shard size will result in more Java heap usage
>>> when searching the Provenance Repository
>>> # but should provide better performance
>>> nifi.provenance.repository.index.shard.size=500 MB
>>> # Indicates the maximum length that a FlowFile attribute can be when
>>> retrieving a Provenance Event from
>>> # the repository. If the length of any attribute exceeds this value, it
>>> will be truncated when the event is retrieved.
>>> nifi.provenance.repository.max.attribute.length=65536
>>> # Volatile Provenance Respository Properties
>>> nifi.provenance.repository.buffer.size=100000
>>> # Component Status Repository
>>> nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.VolatileComponentStatusRepository
>>> nifi.components.status.repository.buffer.size=1440
>>> nifi.components.status.snapshot.frequency=1 min
>>> # Site to Site properties
>>> nifi.remote.input.host=w.x.y.z
>>> nifi.remote.input.secure=false
>>> nifi.remote.input.socket.port=31310
>>> nifi.remote.input.http.enabled=true
>>> nifi.remote.input.http.transaction.ttl=30 sec
>>> # web properties #
>>> nifi.web.war.directory=/opt/nifi/lib
>>> nifi.web.http.host=w.x.y.z
>>> nifi.web.http.port=31308
>>> nifi.web.https.host=
>>> #w.x.y.z
>>> nifi.web.https.port=
>>> #31268
>>> nifi.web.jetty.working.directory=/mnt/mesos/sandbox/work/jetty
>>> nifi.web.jetty.threads=200
>>> # security properties #
>>> nifi.sensitive.props.key=
>>> nifi.sensitive.props.key.protected=
>>> nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL
>>> nifi.sensitive.props.provider=BC
>>> nifi.sensitive.props.additional.keys=
>>> nifi.security.keystore=
>>> nifi.security.keystoreType=
>>> nifi.security.keystorePasswd=
>>> nifi.security.keyPasswd=
>>> nifi.security.truststore=
>>> nifi.security.truststoreType=
>>> nifi.security.truststorePasswd=
>>> nifi.security.needClientAuth=
>>> nifi.security.user.authorizer=file-provider
>>> nifi.security.user.login.identity.provider=
>>> #ldap-provider
>>> nifi.security.ocsp.responder.url=
>>> nifi.security.ocsp.responder.certificate=
>>> # Identity Mapping Properties #
>>> # These properties allow normalizing user identities such that
>>> identities coming from different identity providers
>>> # (certificates, LDAP, Kerberos) can be treated the same internally in
>>> NiFi. The following example demonstrates normalizing
>>> # DNs from certificates and principals from Kerberos into a common
>>> identity string:
>>> #
>>> # nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?),
>>> O=(.*?), L=(.*?), ST=(.*?), C=(.*?)$
>>> # nifi.security.identity.mapping.value.dn=$1@$2
>>> # nifi.security.identity.mapping.pattern.kerb=^(.*?)/instance@(.*?)$
>>> # nifi.security.identity.mapping.value.kerb=$1@$2
>>> # cluster common properties (all nodes must have same values) #
>>> nifi.cluster.protocol.heartbeat.interval=5 sec
>>> nifi.cluster.protocol.is.secure=false
>>> # cluster node properties (only configure for cluster nodes) #
>>> nifi.cluster.is.node=true
>>> nifi.cluster.node.address=w.x.y.z
>>> nifi.cluster.node.protocol.port=31267
>>> nifi.cluster.node.protocol.threads=10
>>> nifi.cluster.node.event.history.size=25
>>> nifi.cluster.node.connection.timeout=5 sec
>>> nifi.cluster.node.read.timeout=5 sec
>>> nifi.cluster.firewall.file=
>>> nifi.cluster.flow.election.max.wait.time=1 mins
>>> nifi.cluster.flow.election.max.candidates=3
>>> # zookeeper properties, used for cluster management #
>>> nifi.zookeeper.connect.string=zookeepers.some.uri:2181
>>> nifi.zookeeper.connect.timeout=3 secs
>>> nifi.zookeeper.session.timeout=3 secs
>>> nifi.zookeeper.root.node=/nifi
>>> # kerberos #
>>> nifi.kerberos.krb5.file=
>>> # kerberos service principal #
>>> nifi.kerberos.service.principal=
>>> nifi.kerberos.service.keytab.location=
>>> # kerberos spnego principal #
>>> nifi.kerberos.spnego.principal=
>>> nifi.kerberos.spnego.keytab.location=
>>> nifi.kerberos.spnego.authentication.expiration=12 hours
>>> # external properties files for variable registry
>>> # supports a comma delimited list of file locations
>>> nifi.variable.registry.properties=
>>> # Build info
>>> nifi.build.tag=nifi-1.1.2-RC1
>>> nifi.build.branch=NIFI-3486-RC1
>>> nifi.build.revision=744cfe6
>>> nifi.build.timestamp=2017-02-16T01:48:27Z
>>>
>>> On Wed, May 3, 2017 at 9:27 PM, Jeff <jtswork@gmail.com> wrote:
>>>
>>>> Can you provide some information on the configuration (nifi.properties)
>>>> of the nodes in your cluster?  Can each node in your cluster ping all the
>>>> other nodes?  Are you running embedded ZooKeeper, or an external one?
>>>>
>>>> On Wed, May 3, 2017 at 8:11 PM Neil Derraugh <
>>>> neil.derraugh@intellifylearning.com> wrote:
>>>>
>>>>> I can't load the canvas right now on our cluster.  I get this error
>>>>> in one of the nodes' nifi-app.log:
>>>>>
>>>>> 2017-05-03 23:40:30,207 WARN [Replicate Request Thread-2]
>>>>> o.a.n.c.c.h.r.ThreadPoolRequestReplicator Failed to replicate request
>>>>> GET /nifi-api/flow/current-user to 10.80.53.39:31212 due to {}
>>>>> com.sun.jersey.api.client.ClientHandlerException: java.net
>>>>> .NoRouteToHostException: Host is unreachable (Host unreachable)
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er.handle(URLConnectionClientHandler.java:155)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.Client.handle(Client.java:652)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.filter.GZIPContentEncodingFilter.h
>>>>> andle(GZIPContentEncodingFilter.java:123)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at org.apache.nifi.cluster.coordination.http.replication.Thread
>>>>> PoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:579)
>>>>> ~[nifi-framework-cluster-1.1.2.jar:1.1.2]
>>>>> at org.apache.nifi.cluster.coordination.http.replication.Thread
>>>>> PoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:771)
>>>>> ~[nifi-framework-cluster-1.1.2.jar:1.1.2]
>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>> [na:1.8.0_121]
>>>>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
>>>>> Caused by: java.net.NoRouteToHostException: Host is unreachable (Host
>>>>> unreachable)
>>>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_121]
>>>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:308)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:326)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
>>>>> ~[na:1.8.0_121]
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er._invoke(URLConnectionClientHandler.java:253)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er.handle(URLConnectionClientHandler.java:153)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> ... 12 common frames omitted
>>>>> 2017-05-03 23:40:30,207 WARN [Replicate Request Thread-2]
>>>>> o.a.n.c.c.h.r.ThreadPoolRequestReplicator
>>>>> com.sun.jersey.api.client.ClientHandlerException: java.net
>>>>> .NoRouteToHostException: Host is unreachable (Host unreachable)
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er.handle(URLConnectionClientHandler.java:155)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.Client.handle(Client.java:652)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.filter.GZIPContentEncodingFilter.h
>>>>> andle(GZIPContentEncodingFilter.java:123)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at org.apache.nifi.cluster.coordination.http.replication.Thread
>>>>> PoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:579)
>>>>> ~[nifi-framework-cluster-1.1.2.jar:1.1.2]
>>>>> at org.apache.nifi.cluster.coordination.http.replication.Thread
>>>>> PoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:771)
>>>>> ~[nifi-framework-cluster-1.1.2.jar:1.1.2]
>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>> [na:1.8.0_121]
>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>> [na:1.8.0_121]
>>>>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
>>>>> Caused by: java.net.NoRouteToHostException: Host is unreachable (Host
>>>>> unreachable)
>>>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_121]
>>>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:308)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:326)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546)
>>>>> ~[na:1.8.0_121]
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
>>>>> ~[na:1.8.0_121]
>>>>> at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
>>>>> ~[na:1.8.0_121]
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er._invoke(URLConnectionClientHandler.java:253)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> at com.sun.jersey.client.urlconnection.URLConnectionClientHandl
>>>>> er.handle(URLConnectionClientHandler.java:153)
>>>>> ~[jersey-client-1.19.jar:1.19]
>>>>> ... 12 common frames omitted
>>>>>
>>>>>
>>>>> GETting the nifi-api's controller/cluster endpoint, I get the list of
>>>>> happy nodes and then this one (IP address changed to protect the innocent):
>>>>> {
>>>>> nodeId: "7f7c1a9e-faa6-413b-9317-bcec4996cb14",
>>>>> address: "w.x.y.z",
>>>>> apiPort: 31212,
>>>>> status: "CONNECTED",
>>>>> roles: [ ],
>>>>> events: [ ]
>>>>> }
>>>>>
>>>>> No events, no roles.  It says it's connected but has no heartbeat, and
>>>>> it's not part of the list of running jobs so far as I can detect.  It's
>>>>> likely a node that had previously been a healthy member of the cluster.
>>>>>
>>>>> Can anybody help me interpret this?
>>>>>
>>>>> I deleted the node and am carrying on as usual.  Just wondering if
>>>>> anyone has any insight into why it would leave the node in the cluster
>>>>> and show it as connected.
>>>>>
>>>>> Thanks,
>>>>> Neil
>>>>>
>>>>
>>>
>
