geode-user mailing list archives

From Dharam Thacker <dharamthacke...@gmail.com>
Subject Re: Pulse is giving stale view of cluster -- lost updates
Date Sat, 16 Feb 2019 14:42:07 GMT
Hi Team,

I am now sure about this issue. It is critical and worth looking at. I
would really appreciate it being addressed in an upcoming release, as it is
a BLOCKER for monitoring systems.

I hope the details below help your analysis. Please let me know if I can
provide anything more.

A few quick glimpses from the log on startup:
[debug 2019/02/16 19:41:45.642 IST <main> tid=0x1] Creating  Management
Region :

[debug 2019/02/16 19:41:45.680 IST <main> tid=0x1] Management Service is
not initialised hence returning from handleLockServiceCreation

[warn 2019/02/16 19:41:46.500 IST <main> tid=0x1] Could not initialize
class org.apache.logging.log4j.util.PropertiesUtil
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.logging.log4j.util.PropertiesUtil

...

*System Specification : *
DISTRIB_ID=LinuxMint
DISTRIB_RELEASE=18.3
DISTRIB_CODENAME=sylvia
DISTRIB_DESCRIPTION="Linux Mint 18.3 Sylvia"

*Java : *
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

*GEODE*: 1.8.0 *Spring-Data-Geode* : 2.1.4.RELEASE (Geode version overridden
from 1.6.0 to 1.8.0)

John,
The application is built entirely on spring-data-geode, and server1.log
shows several related issues that are worth looking at as well.

The link below contains the following artifacts for detailed analysis and
for reproducing the issue:

*Attachments:*
https://drive.google.com/open?id=18AuPx05Aw-ezwNOKqdCfUJUwUycOqzTp

1. I have attached the logs and properties files for both locators
(locator1, locator2)
*Commands:*
start locator --name=locator1 --port=10334
--properties-file=/home/apps/work/geode/locator1/locator.properties
--dir=/home/apps/work/geode/locator1/work

start locator --name=locator2 --port=10335
--properties-file=/home/apps/work/geode/locator2/locator.properties
--dir=/home/apps/work/geode/locator2/work

2. I have attached server1.log at debug level and demo.tar to reproduce
the same issue
*Command* : java -jar demo-0.0.1-SNAPSHOT.jar --demo.name=S1
--demo.port=40441 > server1.log &

3. Below is the Pulse view, which clearly shows that no further JMX
notifications for region initialisation or cache server start were recorded

[image: image.png]
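
To tell whether the manager itself stops emitting these notifications or
whether only Pulse loses them, a plain JMX client can be attached to the
JMX manager and left running while the server is restarted. The following
is a minimal sketch, not part of the attached demo. It assumes the manager
listens on the jmx-manager-port from the quoted properties (1091), that the
distributed system MBean is registered as
GemFire:service=System,type=Distributed, and that the notification types
end in "member.joined", "region.created" and "cacheserver.started". The
class name NotificationProbe and the 60 second wait are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class NotificationProbe {

    // Assumed suffixes of the notification types for the three expected events.
    static final List<String> EXPECTED =
        Arrays.asList("member.joined", "region.created", "cacheserver.started");

    /** Returns the expected suffixes that no observed notification type ends with. */
    static List<String> missing(List<String> observedTypes) {
        List<String> result = new ArrayList<>();
        for (String suffix : EXPECTED) {
            boolean seen = false;
            for (String type : observedTypes) {
                if (type != null && type.endsWith(suffix)) {
                    seen = true;
                    break;
                }
            }
            if (!seen) {
                result.add(suffix);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Host and port are assumptions; pass the jmx-manager bind address and port.
        String host = args.length > 0 ? args[0] : "localhost";
        String port = args.length > 1 ? args[1] : "1091";
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        // Assumed ObjectName of the distributed system MBean.
        ObjectName system = new ObjectName("GemFire:service=System,type=Distributed");
        List<String> observed = Collections.synchronizedList(new ArrayList<String>());

        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            NotificationListener listener = (Notification n, Object handback) -> {
                observed.add(n.getType());
                System.out.println(n.getType() + " : " + n.getMessage());
            };
            mbs.addNotificationListener(system, listener, null, null);
            System.out.println("MemberCount before: "
                + mbs.getAttribute(system, "MemberCount"));

            // Restart the server from another terminal during this window.
            Thread.sleep(60_000);

            System.out.println("MemberCount after: "
                + mbs.getAttribute(system, "MemberCount"));
            System.out.println("Missing events: "
                + missing(new ArrayList<>(observed)));
        }
    }
}
```

If the probe reports missing events after a restart, the notifications are
not being emitted by the manager; if it sees all three while Pulse does
not, the loss would appear to be on the Pulse side.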

Thanks,
Dharam




On Tue, Feb 5, 2019 at 6:28 PM Thacker, Dharam <dharam.thacker@jpmorgan.com>
wrote:

> Hi Team,
>
>
>
> I usually see the following sequence when a new member joins the cluster
> (member = cache-server)
>
>
>
> *JMX Notifications on pulse screen :*
>
>
>
> 1.       Member Joined <<SERVER_NAME>>
>
>
>
> 2.       Region Created With Name /<<REGION_NAME>>
>
>
>
> 3.       Cache Server is Started in the VM
>
>
>
> I am using GEODE 1.8.0 + Spring Data Geode 2.1.4.RELEASE with the following
> properties and Pulse in embedded mode.
>
>
>
> *locator1.properties*
>
> locators=dharam-thakkar[10440],dharam-thakkar[10440]
>
> mcast-port=0
>
> jmx-manager=true
>
> jmx-manager-start=true
>
> jmx-manager-port=1091
>
> jmx-manager-ssl-enabled=false
>
> jmx-manager-bind-address=dharam-thakkar
>
> enable-network-partition-detection=false
>
> http-service-port=9701
>
> http-service-bind-address=dharam-thakkar
>
> log-file=/local/var/tmp/demo-locator1/locator1.log
>
> log-file-size-limit=10
>
> log-level=config
>
> log-disk-space-limit=50
>
>
>
> I tried the sequence below and found that Pulse misses “JMX Notifications”
> and gives an incorrect view of the cluster.
>
>
>
> *Steps to reproduce>>*
>
>
>
> 1.       gfsh start locator --name=demo-locator-1 --port=10440
> --properties-file=locator1.properties --work-dir=/var/tmp/demo-locator1/work
>
>
>
> 2.       java -jar demo-spring-boot-geode-server.jar
> -DserverName=demo-server1 -DserverPort=40440
>
>
>
> 3.       java -jar demo-spring-boot-geode-server.jar
> -DserverName=demo-server2 -DserverPort=40441
>
>
>
> 4.       Everything will look fine at this point and you will see all
> notifications as described in the sequence above
>
>
>
> 5.       PID=`ps auxwww | fgrep 'java' | fgrep 'demo-server1' | awk
> '{print $2}'` ; kill -INT $PID
>
>
>
> 6.       You should see a *“Member Departed <<SERVER_NAME>>”* message in
> Pulse
>
>
>
> 7.       Restart the member -- java -jar demo-spring-boot-geode-server.jar
> -DserverName=demo-server1 -DserverPort=40440
>
>
>
> 8.       Observe pulse notifications and member count
>
>
>
> 9.       You will only see the *“Member Joined <<SERVER_NAME>>”* message in
> Pulse, with no update to the member count
>
>
>
> 10.   If you don’t see the situation described in step 9, repeat steps 5 to
> 7 a few times and you will end up in it
>
>
>
> *Note:* GFSH shows everything correctly; only Pulse has issues.
>
>
>
> Thanks,
>
> Dharam
