geode-user mailing list archives

From Bruce Schuchardt <bschucha...@pivotal.io>
Subject Re: Pulse is giving stale view of cluster -- lost updates
Date Tue, 19 Feb 2019 22:47:25 GMT
I can't comment on most of the content of your server1.log.  The 
java.net.SocketException is caught and logged, and it doesn't appear to 
be causing any other problems.  An internet search suggests that setting

    -Djava.net.preferIPv4Stack=true

might stop it on the machine you're using for testing.  I can also see 
from the debug-level logging that UDP messaging was working okay in 
your run.
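
As a minimal sketch of where that flag would go, assuming the demo jar and arguments from your message (JVM_OPTS and server_cmd are just illustrative names, not anything Geode defines):

```shell
# Assumption: jar name and arguments are copied from the quoted command;
# JVM_OPTS and server_cmd are illustrative names.
JVM_OPTS="-Djava.net.preferIPv4Stack=true"

# JVM options must come before -jar; anything placed after the jar is
# passed to the application as a program argument instead.
server_cmd() {
  echo "java $JVM_OPTS -jar demo-0.0.1-SNAPSHOT.jar --demo.name=$1 --demo.port=$2"
}

# Print the full command line for server S1 on port 40441.
server_cmd S1 40441
```

For the locators started from gfsh, the same property can be passed with the --J option of start locator, for example --J=-Djava.net.preferIPv4Stack=true.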


On 2/16/19 6:42 AM, Dharam Thacker wrote:
> Hi Team,
>
> I am sure about this issue now. It is critical and worth looking at. 
> I would really appreciate it being addressed in an upcoming release, 
> as it's a BLOCKER for monitoring systems.
>
> I hope the details below help with your analysis. Please let me know 
> if I can provide anything more.
>
> Few quick glimpses
> On startup>>
> [debug 2019/02/16 19:41:45.642 IST <main> tid=0x1] Creating  
> Management Region :
>
> [debug 2019/02/16 19:41:45.680 IST <main> tid=0x1] Management Service 
> is not initialised hence returning from handleLockServiceCreation
>
> [warn 2019/02/16 19:41:46.500 IST <main> tid=0x1] Could not initialize 
> class org.apache.logging.log4j.util.PropertiesUtil
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.logging.log4j.util.PropertiesUtil
>
> ...
>
> *_System Specification : _*
> DISTRIB_ID=LinuxMint
> DISTRIB_RELEASE=18.3
> DISTRIB_CODENAME=sylvia
> DISTRIB_DESCRIPTION="Linux Mint 18.3 Sylvia"
>
> *_Java : _*
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 
> 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
>
> *_GEODE_*: 1.8.0 *_Spring-Data-Geode_* : 2.1.4.RELEASE (Geode version 
> overridden from 1.6.0 to 1.8.0)
>
> John,
> It's fully using spring-data-geode and worth looking at several issues 
> related to that as well in server1.log
>
> The link below contains the following artifacts for detailed analysis 
> and for reproducing the issue:
>
> *_Attachments:_*
> https://drive.google.com/open?id=18AuPx05Aw-ezwNOKqdCfUJUwUycOqzTp 
>
> 1. I have attached both locator (locator1,locator2) logs & properties file
> *_Commands:_*
> start locator --name=locator1 --port=10334 
> --properties-file=/home/apps/work/geode/locator1/locator.properties 
> --dir=/home/apps/work/geode/locator1/work
>
> start locator --name=locator2 --port=10335 
> --properties-file=/home/apps/work/geode/locator2/locator.properties 
> --dir=/home/apps/work/geode/locator2/work
>
> 2. I have attached server1.log with debug level & demo.tar to 
> regenerate the same issue
> *_Command_* : java -jar demo-0.0.1-SNAPSHOT.jar --demo.name=S1 
> --demo.port=40441 > server1.log &
>
> 3. Below is the Pulse view, which clearly shows that no further JMX 
> notifications for region initialisation or cache server start were 
> recorded
>
> image.png
>
> Thanks,
> Dharam
>
>
> I have
> - Dharam Thacker
>
>
> On Tue, Feb 5, 2019 at 6:28 PM Thacker, Dharam 
> <dharam.thacker@jpmorgan.com> wrote:
>
>     Hi Team,
>
>     I have usually seen following sequence when new member joins the
>     cluster (member = cache-server)
>
>     *_JMX Notifications on pulse screen :_*
>
>     1.Member Joined <<SERVER_NAME>>
>
>     2.Region Created With Name /<<REGION_NAME>>
>
>     3.Cache Server is Started in the VM
>
>     I am using GEODE 1.8.0  + Spring data geode 2.1.4.RELEASE with
>     following properties and pulse in embedded mode.
>
>     *_locator1.properties_*
>
>     locators=dharam-thakkar[10440],dharam-thakkar[10440]
>
>     mcast-port=0
>
>     jmx-manager=true
>
>     jmx-manager-start=true
>
>     jmx-manager-port=1091
>
>     jmx-manager-ssl-enabled=false
>
>     jmx-manager-bind-address=dharam-thakkar
>
>     enable-network-partition-detection=false
>
>     http-service-port=9701
>
>     http-service-bind-address=dharam-thakkar
>
>     log-file=/local/var/tmp/demo-locator1/locator1.log
>
>     log-file-size-limit=10
>
>     log-level=config
>
>     log-disk-space-limit=50
>
>     I tried below sequence and I see that PULSE is missing “JMX
>     Notifications” and gives incorrect view of cluster.
>
>     *_Steps to reproduce>>_*
>
>     1.gfsh start locator --name=demo-locator-1 --port=10440
>     --properties-file=locator1.properties
>     --dir=/var/tmp/demo-locator1/work
>
>     2.java -jar demo-spring-boot-geode-server.jar
>     -DserverName=demo-server1 -DserverPort=40440
>
>     3.java -jar demo-spring-boot-geode-server.jar
>     -DserverName=demo-server2 -DserverPort=40441
>
>     4.Everything will look fine as of now and you will see all
>     notifications as explained in above sequence
>
>     5.PID=`ps auxwww | fgrep 'java' | fgrep 'demo-server1' | awk
>     '{print $2}'` ; kill -INT $PID
>
>     6.You should see *“Member Departed <<SERVER_NAME>>”* message on pulse
>
>     7.Reboot the member -- java -jar demo-spring-boot-geode-server.jar
>     -DserverName=demo-server1 -DserverPort=40440
>
>     8.Observe pulse notifications and member count
>
>     9.You will only see *“Member Joined <<SERVER_NAME>>” * message on
>     pulse and no update in member count
>
>     10.If you don’t see the situation described in step 9, repeat
>     steps 5 to 7 a few times and you will end up in it
>
>     *_Note:_* Please note that GFSH shows everything correctly but
>     PULSE has issues.
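
Steps 5 to 7 above could be looped with something like the following. This is only a sketch: the jar name, the -D arguments and the port are taken from those steps, while ROUNDS, PAUSE_SECONDS and the function names are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch of steps 5-7: stop one server with SIGINT, then start it again.
# Assumptions (not from the original message): ROUNDS, PAUSE_SECONDS and
# the function names are illustrative; adjust the jar path as needed.

SERVER_NAME="demo-server1"
SERVER_PORT=40440
JAR="demo-spring-boot-geode-server.jar"
ROUNDS="${ROUNDS:-3}"
PAUSE_SECONDS="${PAUSE_SECONDS:-30}"

# Build the start command exactly as written in steps 2 and 7.
start_cmd() {
  echo "java -jar $JAR -DserverName=$1 -DserverPort=$2"
}

# Print the PID of the JVM whose command line names the given server.
server_pid() {
  pgrep -f "serverName=$1"
}

# One bounce: SIGINT the server, wait, then start it again in the background.
bounce_once() {
  local pid
  pid="$(server_pid "$SERVER_NAME")"
  if [ -n "$pid" ]; then
    kill -INT $pid
  fi
  sleep "$PAUSE_SECONDS"   # leave time for "Member Departed" to show up
  $(start_cmd "$SERVER_NAME" "$SERVER_PORT") > "$SERVER_NAME.log" 2>&1 &
  sleep "$PAUSE_SECONDS"   # leave time for the join notifications
}

# Only loop when invoked with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
  for i in $(seq 1 "$ROUNDS"); do
    echo "round $i"
    bounce_once
  done
fi
```

After each round, the Pulse member count can be compared with what gfsh reports, for example with gfsh -e "connect --locator=dharam-thakkar[10440]" -e "list members".
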
>
>     Thanks,
>
>     Dharam
>
>     This message is confidential and subject to terms at:
>     https://www.jpmorgan.com/emaildisclaimer
>     including on confidentiality, legal privilege, viruses and
>     monitoring of electronic messages. If you are not the intended
>     recipient, please delete this message and notify the sender
>     immediately. Any unauthorized use is strictly prohibited.
>
