ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Denis Chudov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-12070) Document the new ability to track system/user time of transactions
Date Wed, 14 Aug 2019 12:57:00 GMT
Denis Chudov created IGNITE-12070:

             Summary: Document the new ability to track system/user time of transactions
                 Key: IGNITE-12070
                 URL: https://issues.apache.org/jira/browse/IGNITE-12070
             Project: Ignite
          Issue Type: Task
          Components: documentation
            Reporter: Denis Chudov

Now there is ability to track system/user time of transactions. System time is the time that
is spent for system activities - i.e. time while aquiring locks, preparing, commiting, etc.User
time is the time that is spent for user activities when client node runs some code while holding

We have ability to log info about transactions that exceed some threshold execution timeout,
or some percentage of all transactions. Log record in case of long-running transactions looks
like following:
[2019-08-09 13:39:49,130][WARN ][sys-stripe-1-#101%client%][root] Long transaction time dump
[startTime=13:39:47.970, totalTime=1160, systemTime=157, userTime=1003, cacheOperationsTime=141,
prepareTime=15, commitTime=0, tx=GridNearTxLocal [...]]
In case of sampling of all transactions:

[2019-08-09 13:39:54,079][INFO ][sys-stripe-2-#102%client%][root] Transaction time dump [startTime=13:39:54.063,
totalTime=15, systemTime=6, userTime=9, cacheOperationsTime=2, prepareTime=3, commitTime=0,
tx=GridNearTxLocal [...]]
Also some of transactions can be skipped to not overflow the log, information about this log
throttling looks like this:
[2019-08-09 13:39:55,109][INFO ][sys-stripe-0-#100%client%][root] Transaction time dumps skipped
because of log throttling: 2
There are JMX parameters and JVM options to control this behavior:
JMX parameter: TransactionsMXBean.longTransactionTimeDumpThreshold
Threshold timeout in milliseconds for long transactions, if transaction exceeds it, it will
be dumped in log with information about how much time did it spent in system time and user
time. Default value is 0. No info about system/user time of long transactions is dumped in
log if this parameter is not set.
JMX parameter: TransactionsMXBean.transactionTimeDumpSamplesCoefficient
The coefficient for samples of completed transactions that will be dumped in log. Must be
float value between 0.0 and 1.0 inclusive. Default value is 0.0.
JMX parameter: TransactionsMXBean.transactionTimeDumpSamplesPerSecondLimit
The limit of samples of completed transactions that will be dumped in log per second, if IGNITE_TRANSACTION_TIME_DUMP_SAMPLES_COEFFICIENT
is above 0.0. Must be integer value greater than 0. Default value is 5.

For the existing long running transaction warning was added information about current system
and user time of transaction:
[2019-08-09 14:10:31,835][WARN ][grid-timeout-worker-#122%client%][root] First 10 long running
transactions [total=1]
[2019-08-09 14:10:31,835][WARN ][grid-timeout-worker-#122%client%][root] >>> Transaction
[startTime=14:10:31.170, curTime=14:10:31.750, systemTime=32, userTime=548, tx=GridNearTxLocal

Also added following metrics to monitor system and user time for single node:
diagnostic.transactions.totalNodeSystemTime - Total transactions system time on node.
diagnostic.transactions.totalNodeUserTime - Total transactions user time on node.
diagnostic.transactions.nodeSystemTimeHistogram - Transactions system times on node represented
as histogram.
diagnostic.transactions.nodeUserTimeHistogram - Transactions user times on node represented
as histogram.

This message was sent by Atlassian JIRA

View raw message