geode-user mailing list archives

From Jens Deppe <jensde...@apache.org>
Subject Re: Pulse is giving stale view of cluster -- lost updates
Date Wed, 20 Feb 2019 05:23:12 GMT
Hi Dharam,

I've tried to replicate this, but have not been successful - I've tried
restarting my Spring Boot app server at least 20 times, but it always shows
up in Pulse.

What would be useful is to look at the data that Pulse retrieves in order
to update its display. If you're using Chrome, can you open the developer
console and select the 'Network' tab? From there, select the 'XHR' filter
tab - that should show you a 'pulseUpdate' request/response every 5
seconds. I'd be interested to see the data (it's a JSON payload) that comes
back when you have all the members in the view and then the data that comes
back when you are missing a member.
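
If it helps to compare the two captures, here is a minimal sketch of a
generic JSON diff. It assumes each 'pulseUpdate' response body has been
saved from the Network tab to a file (the names good.json and bad.json are
hypothetical) and it assumes nothing about the payload's structure: it only
flattens both documents into path/value pairs and reports what differs. The
function names are illustrative and are not part of Pulse or Geode.

```python
import json
from typing import Any, Dict, List


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested JSON into {path: leaf value} pairs."""
    flat: Dict[str, Any] = {}
    if isinstance(node, dict) and node:
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}/{key}"))
    elif isinstance(node, list) and node:
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        # Scalars and empty containers are recorded as leaves.
        flat[prefix] = node
    return flat


def diff_payloads(good: Any, bad: Any) -> Dict[str, List[str]]:
    """Report paths missing from either payload and paths whose value changed."""
    a, b = flatten(good), flatten(bad)
    return {
        "only_in_good": sorted(a.keys() - b.keys()),
        "only_in_bad": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


def diff_files(good_path: str, bad_path: str) -> Dict[str, List[str]]:
    """Load two saved response bodies (hypothetical file names) and diff them."""
    with open(good_path, encoding="utf-8") as fh:
        good = json.load(fh)
    with open(bad_path, encoding="utf-8") as fh:
        bad = json.load(fh)
    return diff_payloads(good, bad)
```

Calling diff_files("good.json", "bad.json") returns three lists of paths.
Entries under "only_in_good" are the ones to look at first, since they are
present in the complete view and absent from the stale one. List elements
are compared by position, so a member that merely moved within an array
will show up as "changed" rather than missing.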

Thanks
--Jens

On Tue, Feb 19, 2019 at 2:47 PM Bruce Schuchardt <bschuchardt@pivotal.io>
wrote:

> I can't comment on most of the content of your server1.log.  The
> java.net.SocketException doesn't seem to be causing any problems but an
> internet search indicated that setting
>
> -Djava.net.preferIPv4Stack=true
>
> might fix that problem for the machine you're using for testing.  This
> exception is caught and logged but shouldn't cause any other problems.
> Indeed, I can see from the debug-level logging that UDP messaging was
> working okay in your run.
>
>
> On 2/16/19 6:42 AM, Dharam Thacker wrote:
>
> Hi Team,
>
> I am sure about this issue now; it is really critical and worth looking
> at. I would really appreciate it being addressed in an upcoming release,
> as it's a BLOCKER for monitoring systems.
>
> I hope the details below help with your analysis. Please let me know if
> I can provide any more details.
>
> A few quick glimpses from startup:
> [debug 2019/02/16 19:41:45.642 IST <main> tid=0x1] Creating  Management
> Region :
>
> [debug 2019/02/16 19:41:45.680 IST <main> tid=0x1] Management Service is
> not initialised hence returning from handleLockServiceCreation
>
> [warn 2019/02/16 19:41:46.500 IST <main> tid=0x1] Could not initialize
> class org.apache.logging.log4j.util.PropertiesUtil
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.logging.log4j.util.PropertiesUtil
>
> ...
>
> *System Specification : *
> DISTRIB_ID=LinuxMint
> DISTRIB_RELEASE=18.3
> DISTRIB_CODENAME=sylvia
> DISTRIB_DESCRIPTION="Linux Mint 18.3 Sylvia"
>
> *Java : *
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build
> 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
>
> *GEODE*: 1.8.0 *Spring-Data-Geode* : 2.1.4.RELEASE (Geode version
> overridden from 1.6.0 to 1.8.0)
>
> John,
> The app uses spring-data-geode throughout, so it is also worth looking at
> the several related issues that show up in server1.log.
>
> The link below contains the following artifacts for detailed analysis and
> for reproducing the issue:
>
> *Attachments:*
> https://drive.google.com/open?id=18AuPx05Aw-ezwNOKqdCfUJUwUycOqzTp
>
> 1. I have attached both locator (locator1,locator2) logs & properties file
> *Commands:*
> start locator --name=locator1 --port=10334
> --properties-file=/home/apps/work/geode/locator1/locator.properties
> --dir=/home/apps/work/geode/locator1/work
>
> start locator --name=locator2 --port=10335
> --properties-file=/home/apps/work/geode/locator2/locator.properties
> --dir=/home/apps/work/geode/locator2/work
>
> 2. I have attached server1.log at debug level & demo.tar to reproduce
> the same issue
> *Command* : java -jar demo-0.0.1-SNAPSHOT.jar --demo.name=S1
> --demo.port=40441 > server1.log &
>
> 3. Below is the Pulse view, which clearly shows that no further JMX
> notifications regarding region initialisation or cache server startup
> were recorded
>
> [image: image.png]
>
> Thanks,
> Dharam
>
>
> - Dharam Thacker
>
>
> On Tue, Feb 5, 2019 at 6:28 PM Thacker, Dharam <
> dharam.thacker@jpmorgan.com> wrote:
>
>> Hi Team,
>>
>>
>>
>> I usually see the following sequence when a new member joins the cluster
>> (member = cache-server)
>>
>>
>>
>> *JMX Notifications on pulse screen :*
>>
>>
>>
>> 1.       Member Joined <<SERVER_NAME>>
>>
>>
>>
>> 2.       Region Created With Name /<<REGION_NAME>>
>>
>>
>>
>> 3.       Cache Server is Started in the VM
>>
>>
>>
>> I am using GEODE 1.8.0 + Spring Data Geode 2.1.4.RELEASE with the
>> following properties and Pulse in embedded mode.
>>
>>
>>
>> *locator1.properties*
>>
>> locators=dharam-thakkar[10440],dharam-thakkar[10440]
>>
>> mcast-port=0
>>
>> jmx-manager=true
>>
>> jmx-manager-start=true
>>
>> jmx-manager-port=1091
>>
>> jmx-manager-ssl-enabled=false
>>
>> jmx-manager-bind-address=dharam-thakkar
>>
>> enable-network-partition-detection=false
>>
>> http-service-port=9701
>>
>> http-service-bind-address=dharam-thakkar
>>
>> log-file=/local/var/tmp/demo-locator1/locator1.log
>>
>> log-file-size-limit=10
>>
>> log-level=config
>>
>> log-disk-space-limit=50
>>
>>
>>
>> I tried the sequence below and I see that PULSE misses “JMX
>> Notifications” and gives an incorrect view of the cluster.
>>
>>
>>
>> *Steps to reproduce>>*
>>
>>
>>
>> 1.       gfsh start locator --name=demo-locator-1 --port=10440
>> --properties-file=locator1.properties --dir=/var/tmp/demo-locator1/work
>>
>>
>>
>> 2.       java -jar demo-spring-boot-geode-server.jar
>> -DserverName=demo-server1 -DserverPort=40440
>>
>>
>>
>> 3.       java -jar demo-spring-boot-geode-server.jar
>> -DserverName=demo-server2 -DserverPort=40441
>>
>>
>>
>> 4.       At this point everything looks fine, and you will see all the
>> notifications in the sequence described above
>>
>>
>>
>> 5.       PID=`ps auxwww | fgrep 'java' | fgrep 'demo-server1' | awk
>> '{print $2}'` ; kill -INT $PID
>>
>>
>>
>> 6.       You should see *“Member Departed <<SERVER_NAME>>”* message on
>> pulse
>>
>>
>>
>> 7.       Reboot the member -- java -jar
>> demo-spring-boot-geode-server.jar -DserverName=demo-server1
>> -DserverPort=40440
>>
>>
>>
>> 8.       Observe pulse notifications and member count
>>
>>
>>
>> 9.       You will only see *“Member Joined <<SERVER_NAME>>”* message on
>> pulse and no update in member count
>>
>>
>>
>> 10.   If you don’t see the situation described in step 9, repeat steps 5
>> to 7 a few times and you will end up in it
>>
>>
>>
>> *Note:* Please note that GFSH shows everything correctly but PULSE has
>> issues.
>>
>>
>>
>> Thanks,
>>
>> Dharam
>>
>> This message is confidential and subject to terms at:
>> https://www.jpmorgan.com/emaildisclaimer
>> including on confidentiality, legal privilege, viruses and monitoring of
>> electronic messages. If you are not the intended recipient, please delete
>> this message and notify the sender immediately. Any unauthorized use is
>> strictly prohibited.
>>
>
