geode-user mailing list archives

From: Darrel Schneider <dschnei...@pivotal.io>
Subject: Re: Some Geode management metrics returning 0s after OS upgrade
Date: Tue, 12 Feb 2019 16:08:26 GMT
You need to actually be executing functions and doing region puts/gets for
these stats to be non-zero.
The gfs files record the gets/puts in the CachePerfStats. One
CachePerfStats represents a combination of all the regions. Other
CachePerfStats represent just one region (those have that region name on
them). You want to look at the first one as it represents the entire cache.
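
As a rough way to check the three attributes after driving some puts, gets, and function executions, here is a minimal sketch that reads them over JMX using only JDK classes. The attribute names are derived from the getters named in this thread by the usual JMX convention (getPutsRate becomes "PutsRate"). The object name pattern GemFire:type=Member,* is an assumption about how member MBeans are registered, and the class name MemberRateProbe is made up for illustration.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class MemberRateProbe {
    // Attribute names for the three getters discussed in the thread.
    static final String[] RATE_ATTRIBUTES = {"PutsRate", "GetsRate", "FunctionExecutionRate"};

    // ASSUMPTION: member MBeans are registered under this object name pattern.
    static final String MEMBER_PATTERN = "GemFire:type=Member,*";

    /** Prints the three rates for every matching member MBean and returns how many were found. */
    static int probe(MBeanServerConnection conn) {
        try {
            int found = 0;
            for (ObjectName member : conn.queryNames(new ObjectName(MEMBER_PATTERN), null)) {
                found++;
                for (String attr : RATE_ATTRIBUTES) {
                    System.out.println(member + " " + attr + " = " + conn.getAttribute(member, attr));
                }
            }
            return found;
        } catch (JMException | IOException e) {
            // Wrap checked JMX and I/O failures so callers get a single unchecked exception.
            throw new IllegalStateException("JMX query failed", e);
        }
    }

    public static void main(String[] args) {
        // Uses the local platform MBean server, so this must run inside the member JVM.
        int found = probe(ManagementFactory.getPlatformMBeanServer());
        System.out.println(found + " member MBean(s) found");
    }
}
```

Run outside a Geode member, this finds no matching MBeans and prints a count of 0. To query a remote member instead, pass probe() an MBeanServerConnection obtained through the JDK's JMXConnectorFactory. If the rates are still 0 while the cache is under load, compare them with the cache-wide CachePerfStats in the gfs file.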


On Tue, Feb 12, 2019 at 6:43 AM Vahram Aharonyan <vaharonyan@vmware.com>
wrote:

> Hi All,
>
>
>
> Various experiments and long-term monitoring showed that the only real
> problem that remains is with these 3 metrics:
>
>
>
> org.apache.geode.management.MemberMXBean#getFunctionExecutionRate
>
> org.apache.geode.management.MemberMXBean#getPutsRate
>
> org.apache.geode.management.MemberMXBean#getGetsRate
>
>
>
> All the others, related to either network or disk, report non-zero
> values, but these three are constantly 0. These seem to be Geode-internal
> metrics and should not depend on the system, right? Is there some info on
> these metrics in the *.gfs files, so we can see whether they have actual
> values or not?
>
>
>
> Thanks,
>
> Vahram.
>
>
>
> *From:* Vahram Aharonyan <vaharonyan@vmware.com>
> *Sent:* Thursday, February 7, 2019 5:19 PM
> *To:* user@geode.apache.org
> *Subject:* RE: Some Geode management metrics returning 0s after OS upgrade
>
>
>
> Hi Kirk,
>
>
>
> We were not able to find any erroneous message from StatsSampler in our
> log files.
>
> Is running these tests straightforward? Do we have some doc describing
> this process? What requirements must be met to be able to run these
> tests?
>
>
>
> Hi Barry,
>
>
>
> Yes, we see values for other MBean attributes reported.
>
>
>
> You were right, thread is there:
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 | "Thread-10 StatSampler" #59
> daemon prio=10 os_prio=0 tid=0x00007f1fc8951800 nid=0x2d0 in Object.wait()
> [0x00007f1fb14e3000]
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |    java.lang.Thread.State:
> TIMED_WAITING (on object monitor)
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |       at
> java.lang.Object.wait(Native Method)
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |       at
> org.apache.geode.internal.statistics.HostStatSampler.delay(HostStatSampler.java:520)
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |       - locked
> <0x0000000651581a68> (a
> org.apache.geode.internal.statistics.GemFireStatSampler)
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |       at
> org.apache.geode.internal.statistics.HostStatSampler.run(HostStatSampler.java:208)
>
> INFO   | jvm 1    | 2019/02/07 12:15:54 |       at
> java.lang.Thread.run(Thread.java:748)
>
>
>
> Could this be caused by missing privileges to access system resources?
> Or is there some way to check whether this information is available in
> the *.gfs stat files from the locator or server? I looked through these
> files but was not able to find anything linking them to the metrics
> mentioned below.
>
>
>
> Thanks,
>
> Vahram.
>
>
>
> *From:* Barry Oglesby <boglesby@pivotal.io>
> *Sent:* Wednesday, February 6, 2019 11:21 PM
> *To:* user@geode.apache.org
> *Subject:* Re: Some Geode management metrics returning 0s after OS upgrade
>
>
>
> Do you see values for other MBean attributes?
>
>
>
> If you do a thread dump in your server JVM(s), you should see a thread
> like this running:
>
>
>
> "StatSampler" #39 daemon prio=10 os_prio=31 tid=0x00007fdcbf004000
> nid=0x7003 in Object.wait() [0x000070000c50a000]
>
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>
>                 at java.lang.Object.wait(Native Method)
>
>                 at
> org.apache.geode.internal.statistics.HostStatSampler.delay(HostStatSampler.java:519)
>
>                 - locked <0x00000007a8911160> (a
> org.apache.geode.internal.statistics.GemFireStatSampler)
>
>                 at
> org.apache.geode.internal.statistics.HostStatSampler.run(HostStatSampler.java:219)
>
>                 at java.lang.Thread.run(Thread.java:745)
>
>
>
>
>
>
>
> On Wed, Feb 6, 2019 at 9:40 AM Kirk Lund <klund@apache.org> wrote:
>
> Phantom OS might have caused the StatSampler to fail or even crash. That's
> the only explanation I can think of that might result in the non-OS related
> stats remaining zero. You might want to look through the log to see if the
> StatSampler logged any problems. Other than that, you could try running
> every statistic related test/integrationTest/distributedTest in Geode on
> Phantom OS to see how the tests behave.
>
>
>
> On Wed, Feb 6, 2019 at 7:49 AM Anthony Baker <abaker@pivotal.io> wrote:
>
> I wouldn’t be surprised if other OS-related things are broken on Phantom
> OS as well. We use JNA for most native calls. Look at `git grep
> Native.register` to see what POSIX-like things might be affected.
>
>
>
> Anthony
>
>
>
>
>
> On Feb 6, 2019, at 7:28 AM, Jacob Barrett <jbarrett@pivotal.io> wrote:
>
>
>
> We don’t have any hooks into the stats for this OS.
>
>
> On Feb 6, 2019, at 7:16 AM, Jens Deppe <jdeppe@pivotal.io> wrote:
>
> From SLES 11 to Phantom OS
>
>
>
> (I had already asked, but my CC got scrambled :( )
>
>
>
> On Wed, Feb 6, 2019 at 7:10 AM Anthony Baker <abaker@pivotal.io> wrote:
>
> Which OS did you upgrade to?
>
>
>
> Anthony
>
>
>
> On Feb 6, 2019, at 1:25 AM, Vahram Aharonyan <vaharonyan@vmware.com>
> wrote:
>
>
>
> Hi All,
>
>
>
> For troubleshooting purposes we have been collecting some data from
> Geode cluster members using the following APIs:
>
>
>
> org.apache.geode.management.MemberMXBean#getFunctionExecutionRate
>
> org.apache.geode.management.MemberMXBean#getPutsRate
>
> org.apache.geode.management.MemberMXBean#getGetsRate
>
>
>
> org.apache.geode.management.NetworkMetrics#getBytesReceivedRate
>
> org.apache.geode.management.NetworkMetrics#getBytesSentRate
>
>
>
> org.apache.geode.management.DiskMetrics#getDiskFlushAvgLatency
>
> org.apache.geode.management.DiskMetrics#getDiskReadsRate
>
> org.apache.geode.management.DiskMetrics#getDiskWritesRate
>
>
>
> Recently we replaced our base OS, and all the values reported back by
> Geode from these calls became 0s.
>
> Could someone help us understand how these metrics are collected by
> Geode? Could it be that Geode uses some system utilities or system calls
> that existed in our previous appliance and were removed in the newer
> version of the system, causing Geode to return only 0s?
>
>
>
> Thanks,
>
> Vahram.
>
>
>
>
>
>
