incubator-kato-spec mailing list archives

From "Bobrovsky, Konstantin S"
Subject RE: Snapshot Design Part II
Date Tue, 09 Feb 2010 09:09:01 GMT
[ Hope I'm not too late for the discussion ]

There is widely used "prior art" in this field: "BTrace" and "DTrace". Some ideas
could probably be borrowed for the Snapshot API.

First a few links:
- BTrace docs:

- JavaOne2009 article from Ken Sipe "Debugging Your Production JVM" on using BTrace (and lots
of other tools) for diagnosing live JVM/application problems:

- DTrace docs:

Second, here is a short intro to BTrace.

BTrace is basically 3 things:
(1) an in-JVM agent which "executes" monitoring requests in the context of the live JVM the
agent is embedded in

(2) a user-side client which translates BTrace scripts into monitoring queries, sends them
over to the agent and captures the reply

(3) BTrace scripts, which are normal Java programs (using BTrace libraries) that use annotations
to construct the "what and when/where" specification and use normal Java code to specify what
to do with the "what" once the "when" condition is met in the "where" code.

As far as I can see from the user's guide, the agent actually instruments application code by
injecting the user script's Java code into classes/methods satisfying the "when" criteria. I guess
this might be done via JVMTI's class/method redefinition functionality - i.e. in a JVM-neutral
way. But some features (like intercepting object allocations) are probably Hotspot-specific,
as they require non-standard (non-JVMTI) JVM support.
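
To make the guess above a bit more concrete: at the Java level the standard, JVM-neutral hook
for this kind of injection is java.lang.instrument (in Hotspot it is, AFAIK, itself implemented
on top of JVMTI). The sketch below is not BTrace's code. ProbeAgent and MatchingTransformer are
hypothetical names, and the transformer only records which classes match a "where" prefix
instead of rewriting any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent skeleton; the names are illustrative and not part of BTrace.
final class ProbeAgent {

    /** Records classes whose internal name starts with a prefix; never changes bytecode. */
    static final class MatchingTransformer implements ClassFileTransformer {
        private final String internalNamePrefix;
        final List<String> matched = new CopyOnWriteArrayList<>();

        MatchingTransformer(String internalNamePrefix) {
            this.internalNamePrefix = internalNamePrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // className uses '/' separators, e.g. "java/lang/ClassLoader"; it may be null.
            if (className != null && className.startsWith(internalNamePrefix)) {
                matched.add(className);
                // A real tool would return rewritten class bytes here.
            }
            return null; // null tells the JVM to keep the class unchanged
        }
    }

    /** Entry point used when the JVM is started with -javaagent:agent.jar=some.package */
    public static void premain(String args, Instrumentation inst) {
        String prefix = (args == null) ? "" : args.replace('.', '/');
        inst.addTransformer(new MatchingTransformer(prefix));
    }
}
```

The transformer can be exercised directly, without starting an agent, by calling transform()
with a class name and checking the matched list.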

What can be borrowed

1) The set of "when" specifications:

- OnMethod. This annotation specifies a BTrace probe point by specifying a Java class (or
classes), a method (or methods in it) and a specific location within it. A BTrace trace action
method bearing this annotation is called when the traced program reaches the matching
location. [ The most powerful one - Konst ]

- OnTimer. This annotation can be used to specify tracing actions that have to run periodically,
once every N milliseconds.

- OnEvent. This annotation is used to associate tracing methods with "external" events sent by
the BTrace client. BTrace methods bearing this annotation are called when the BTrace client sends
an "event". The client may send an event in response to some form of user request (like pressing
Ctrl-C or choosing a GUI menu item).

- OnLowMemory
- OnProbe. This annotation can be used to avoid referring to implementation-internal classes
in BTrace scripts. @OnProbe probe specifications are mapped to one or more @OnMethod specifications
by the BTrace VM agent. Currently, this mapping is done using an XML probe descriptor file.
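
For comparison, a trigger in the spirit of OnLowMemory can be approximated with the standard
java.lang.management API alone. This is only a sketch of the idea and not how BTrace implements
it; LowMemoryTrigger and arm() are hypothetical names, and the choice to arm only heap pools is
an assumption made for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical helper; not part of BTrace or of any Snapshot API proposal.
final class LowMemoryTrigger {

    /**
     * Sets a usage threshold at the given fraction of the maximum size on every heap
     * pool that supports thresholds, and runs the action when a threshold is crossed.
     * Returns the number of pools that were armed.
     */
    static int arm(double fraction, Runnable action) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null || usage.getMax() <= 0) {
                continue; // maximum size undefined, so no sensible threshold
            }
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                pool.setUsageThreshold((long) (usage.getMax() * fraction));
                armed++;
            }
        }
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                action.run();
            }
        };
        // The platform MemoryMXBean is documented to be a NotificationEmitter.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(listener, null, null);
        return armed;
    }
}
```

A call such as LowMemoryTrigger.arm(0.9, action) would run the action once a heap pool passes
90% of its maximum. Note that the action here is user Java code running in a JVM that is already
short of memory, which is exactly the concern a declarative API tries to avoid.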

2) The set of "what" specifications.

These are annotations that make it possible to refer to various method arguments, return values, etc.
They are used in the action method bearing the @OnMethod annotation. Here is an example illustrating
this (Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved. See full source with copyright

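@BTrace public class Classload {
   @OnMethod(clazz="+java.lang.ClassLoader", method="defineClass", location=@Location(Kind.RETURN))
   public static void defineclass(@Return Class cl) {
       println(strcat("loaded ", name(cl)));
       jstack();
   }
}
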
This BTrace script dumps the current stack whenever a new class is loaded via java.lang.ClassLoader::defineClass.
This is done by the method 'defineclass', which is injected into the live JVM and which gets the
return value of defineClass as its incoming argument (Class cl).

3) Implementation strategy. I.e. the Snapshot API VM-side RI could be implemented as a JVMTI agent
which injects code into a live JVM and also uses Hotspot-specific DTrace-based features like
tracing object allocations without (AFAIU) the Hotspot copyright.

4) DTrace-based Hotspot-specific features. These are additional "when" specifications:
- vm init/shutdown
- thread start/stop
- class load/unload
- gc begin/end
- mem pool begin/end
- method compile begin/end
- compiled method load/unload
- a variety of monitor events
- method entry/exit
- object allocation
- (any) JNI function interception

Intel Novosibirsk
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park, 
17 Krylatskaya Str., Bldg 4, Moscow 121614, 
Russian Federation

>-----Original Message-----
>From: Steve Poole
>Sent: Monday, January 25, 2010 9:53 PM
>Subject: Snapshot Design Part II
>Let's explore the basic outline of this API.
>1 - It is a VM side API. That is, it is active on the running system.
>2 - It is declarative.  That means, like SQL, OQL or similar query
>languages, it is an API where the user describes what they want to have
>happen. There are no call backs into user code for selection purposes.
>The reasoning behind this is:
>A) At the point where this collection process is triggered the JVM may be
>in a poor state - at least where running Java is concerned.
>B) A declarative form allows the JVM vendor to implement their solution at
>what ever level they choose.
>C) Similarly,  larger execution optimisations can be made by the JVM.
>D) Having selection criteria written in Java could result in queries
>altering that which is being queried. Ignoring the inefficiencies, there is
>a risk there could be an infinite loop or deadlock.
>3 - It is dynamic - in that the definition can generally be changed up to
>the time when the collection is triggered.  It is at least additive in that
>applications may wish to register their specific selections to be dumped and
>these selections could be mutually exclusive.  In the event of having an
>"A and Not B"  + "B and Not A" situation the API must resolve this into "A and
>4 - Multiple instances of the snapshot definition can be created and in
>progress at the same time.
>5 - Definitions have a mechanism to allow them to define when they would be
>triggered. This would cover particular failure events such as exceptions.
>6 -  There would be some concept of a default snapshot that would be
>triggered by the JVM on a failing condition such as Out of Memory.
>7 - The selection process  that chooses what would be in the dump has to
>have at least three component parts
>A) A way to define a starting point - this could be a starting object,
>class,  thread , class loader or even the heap.
>B) A way to define what should be collected and at what level of
>representation. By package, by classloader ,  matching super/subclasses,
>implements an interface etc.   When reporting an object, what gets reported:
>all fields, object references (i.e. some unique id), array sizes, array
>contents etc?
>C) A way to define range and direction.  Consider what happens if you
>wanted to get all objects of a type that were contained in a Map.   At the
>API level a Map is a single idea: at the implementation level it's a
>collection of objects.  When searching for an instance the search needs to
>either have an understanding of logical structures or just be constrained to
>a number of hops in navigating object relationships.  Maybe both.  Consider
>also if you wanted to dump all the threads, their stacks and list the
>references (unique ids) they contain.   That's a different axis to the
>the heap" process.
>8 - The API should probably cater for the situation where the selection
>requirements need to be provided to the JVM on start up.  This may be due to
>performance issues or because we identify an entity or situation that can
>only be reached during startup.  I don't have an example at this point but I
>do want to mention the possibility.
>9) Execution time performance of this API is critical  - the design must
>offer the implementer the option of ahead of time compilation for these
>10) It needs to be at the appropriate level.   It's easy to see that there
>are some likely scenarios for this API which will require that all objects
>in the heap are visited.   For instance if you wanted to get a list of the
>objects that had a reference to some other object.    Traversing the heap
>for JVMs is a standard activity.  It doesn't seem that difficult to imagine
>a JVMTI extension or equivalent that could provide a callback mechanism for
>each live object found.   On the other hand we don't want to "just" or
>"even" provide a C level API since that would constrain the JVM and/or the
>JIT options for optimization.
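
As a toy illustration of point 7 in the quoted proposal (a starting point, a "what to collect"
criterion, and a range limited to a number of hops), here is a small sketch. Everything in it
is hypothetical: SnapshotSketch, Node and Selection do not exist in Kato or JVMTI, and the "heap"
is just a map of ids, so this only shows the shape of a declarative selection that carries plain
data and no user callbacks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model; none of these types exist in Kato or JVMTI.
final class SnapshotSketch {

    /** A heap object reduced to an id, a type name and its outgoing references. */
    static final class Node {
        final String id;
        final String type;
        final List<String> refs;

        Node(String id, String type, String... refs) {
            this.id = id;
            this.type = type;
            this.refs = Arrays.asList(refs);
        }
    }

    /** Declarative selection: where to start, which type to report, how far to walk. */
    static final class Selection {
        final String startId;
        final String typeName;
        final int maxHops;

        Selection(String startId, String typeName, int maxHops) {
            this.startId = startId;
            this.typeName = typeName;
            this.maxHops = maxHops;
        }
    }

    /** Breadth-first walk from the start; reports matching ids at most maxHops references away. */
    static List<String> collect(Map<String, Node> heap, Selection sel) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(sel.startId);
        seen.add(sel.startId);
        for (int hop = 0; hop <= sel.maxHops && !frontier.isEmpty(); hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                Node node = heap.get(id);
                if (node == null) {
                    continue; // dangling reference in the toy heap
                }
                if (node.type.equals(sel.typeName)) {
                    result.add(id);
                }
                for (String ref : node.refs) {
                    if (seen.add(ref)) { // visit each object once, so cycles terminate
                        next.add(ref);
                    }
                }
            }
            frontier = next;
        }
        return result;
    }
}
```

With a map object that references two entries, each of which references one object of the wanted
type, a hop limit of 2 reaches those objects but nothing they reference in turn. A selection that
understands logical structures (the other option in 7C) would need more than a hop count.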
