incubator-kato-spec mailing list archives

From Steve Poole <>
Subject Snapshot Design Part II
Date Mon, 25 Jan 2010 15:52:59 GMT
Let's explore the basic outline of this API.

1 - It is a VM-side API. That is, it is active on the running system.

2 - It is declarative. That means that, like SQL, OQL or similar query
languages, it is an API where the user describes what they want to have
happen. There are no callbacks into user code for selection purposes. The
reasoning behind this is:

A) At the point where this collection process is triggered the JVM may be
in a poor state, at least where running Java code is concerned.

B) A declarative form allows the JVM vendor to implement their solution at
whatever level they choose.

C) Similarly, larger execution optimisations can be made by the JVM.

D) Having selection criteria written in Java could result in queries
altering that which is being queried. Ignoring the inefficiencies, there is
a risk there could be an infinite loop or deadlock.
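
As a rough illustration of point 2, a declarative definition could be nothing more than immutable data that the JVM interprets however it likes. The sketch below is only an assumption about shape: DeclarativeSketch, Kind, Clause and SnapshotQuery are hypothetical names and are not part of Kato or any JVM.

```java
import java.util.List;

// Hypothetical sketch only: none of these types exist in Kato or in any JVM.
final class DeclarativeSketch {

    // What a clause matches against.
    enum Kind { PACKAGE, CLASS_NAME, IMPLEMENTS }

    // One selection clause. It is pure data: there is no user code
    // for the JVM to call back into while the dump is being taken.
    record Clause(Kind kind, String pattern, boolean include) {}

    // An immutable description of what the user wants collected.
    record SnapshotQuery(List<Clause> clauses) {
        SnapshotQuery {
            clauses = List.copyOf(clauses); // defensive, unmodifiable copy
        }

        // Render the query as text, e.g. for logging or handing to the VM.
        String describe() {
            StringBuilder sb = new StringBuilder();
            for (Clause c : clauses) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(c.include() ? '+' : '-')
                  .append(c.kind())
                  .append('(').append(c.pattern()).append(')');
            }
            return sb.toString();
        }
    }
}
```

Because the query carries no behaviour, evaluating it cannot alter the heap being queried, and the vendor is free to compile or interpret it at whatever level suits them.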

3 - It is dynamic, in that the definition can generally be changed up until
the time when the collection is triggered. It is at least additive, in that
applications may wish to register their own selections to be dumped, and
these selections could be mutually exclusive. In the event of an "A
and Not B" + "B and Not A" situation the API must resolve this into "A and

4 - Multiple instances of the snapshot definition can be created and in
progress at the same time.

5 - Definitions have a mechanism that allows them to define when they would be
triggered. This would cover particular failure events such as exceptions.

6 - There would be some concept of a default snapshot that would be
triggered by the JVM on a failing condition such as Out of Memory.
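
Points 4 to 6 could be pictured as a registry that holds several definitions at once, each naming the events that trigger it, alongside a built-in default for Out of Memory. This is a minimal sketch under assumptions: TriggerSketch, Trigger, Definition and Registry are hypothetical names, and the set of trigger events is invented for illustration.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: the trigger names and registry are assumptions.
final class TriggerSketch {

    enum Trigger { OUT_OF_MEMORY, UNCAUGHT_EXCEPTION, ON_DEMAND }

    // A snapshot definition together with the events that should fire it.
    record Definition(String name, Set<Trigger> triggers) {}

    static final class Registry {
        private final List<Definition> definitions = new ArrayList<>();

        // The default snapshot the JVM would take on Out of Memory.
        private final Definition defaultSnapshot =
                new Definition("default", EnumSet.of(Trigger.OUT_OF_MEMORY));

        // Registration is additive: several definitions can coexist.
        void register(Definition d) {
            definitions.add(d);
        }

        // Names of every definition fired by the event, default first.
        List<String> firedBy(Trigger t) {
            List<String> names = new ArrayList<>();
            if (defaultSnapshot.triggers().contains(t)) {
                names.add(defaultSnapshot.name());
            }
            for (Definition d : definitions) {
                if (d.triggers().contains(t)) {
                    names.add(d.name());
                }
            }
            return names;
        }
    }
}
```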

7 - The selection process that chooses what would be in the dump has to
have at least three component parts:

A) A way to define a starting point. This could be a starting object,
class, thread, class loader or even the heap.

B) A way to define what should be collected and at what level of
representation: by package, by class loader, matching super/subclasses,
implementing an interface etc. When reporting an object, what gets reported:
all fields, object references (i.e. some unique id), array sizes, array
contents etc?

C) A way to define range and direction. Consider what happens if you
wanted to get all objects of a type that were contained in a Map. At the
API level a Map is a single idea; at the implementation level it's a
collection of objects. When searching for an instance, the search needs to
either have an understanding of logical structures or just be constrained to
a number of hops in navigating object relationships. Maybe both. Consider
also if you wanted to dump all the threads, their stacks and the object
references (unique ids) they contain. That's a different axis to the "walk
the heap" process.
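
To make the three parts of point 7 a little more concrete, here is a toy sketch of a selection that names a starting point, a type filter with a level of detail, and a hop limit. Everything in it is an assumption for illustration: SelectionSketch and its members are hypothetical, the "heap" is just a map from object id to the ids it references, and an object's type is taken to be the prefix of its id.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: a toy model, not a proposed API.
final class SelectionSketch {

    enum StartKind { OBJECT, CLASS, THREAD, CLASS_LOADER, HEAP } // part A
    enum Detail { ID_ONLY, ALL_FIELDS, ARRAY_CONTENTS }          // part B

    // Part A: startKind/start. Part B: typePrefix/detail. Part C: maxHops.
    // The toy walker below interprets only start, typePrefix and maxHops.
    record Selection(StartKind startKind, String start,
                     String typePrefix, Detail detail, int maxHops) {}

    // Breadth-first walk from the start id, following references for at
    // most maxHops hops and collecting ids whose type prefix matches.
    static List<String> collect(Map<String, List<String>> refs, Selection s) {
        List<String> hits = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(s.start());
        seen.add(s.start());
        for (int hop = 0; hop <= s.maxHops() && !frontier.isEmpty(); hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                if (id.startsWith(s.typePrefix())) {
                    hits.add(id);
                }
                for (String target : refs.getOrDefault(id, List.of())) {
                    if (seen.add(target)) { // visit each object only once
                        next.add(target);
                    }
                }
            }
            frontier = next;
        }
        return hits;
    }
}
```

With a Map whose internal nodes sit one hop away and whose values sit two hops away, a limit of two hops reaches the values but nothing they refer to. That is the "number of hops" option; a walker that understood Map as a logical structure would not need the caller to know that depth.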

8 - The API should probably cater for the situation where the selection
requirements need to be provided to the JVM on start up.  This may be due to
performance issues or because we identify an entity or situation that can
only be reached during startup.  I don't have an example at this point but I
do want to mention the possibility.

9 - Execution-time performance of this API is critical: the design must
offer the implementer the option of ahead-of-time compilation for these

10 - It needs to be at the appropriate level. It's easy to see that there
are some likely scenarios for this API which will require that all objects
in the heap are visited, for instance if you wanted to get a list of the
objects that had a reference to some other object. Traversing the heap
is a standard activity for JVMs. It doesn't seem that difficult to imagine
a JVMTI extension or equivalent that could provide a callback mechanism for
each live object found. On the other hand we don't want to "just" or maybe
"even" provide a C level API since that would constrain the JVM and/or the
JIT options for optimization.
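
The "who refers to this object" case shows why a full heap visit is unavoidable: the references only point one way, so every object has to be inspected. The following is a toy sketch under the same kind of assumption as before (ReferrerSketch is a hypothetical name and the heap is modelled as a map from object id to the ids it references); it does not use JVMTI.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: a toy heap, not a JVMTI binding.
final class ReferrerSketch {

    // Visit every object in the toy heap and keep those that hold a
    // reference to the target. Cost grows with the whole heap, not with
    // the size of the answer.
    static List<String> referrersOf(Map<String, List<String>> heap, String target) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : heap.entrySet()) {
            if (entry.getValue().contains(target)) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result); // stable output regardless of map order
        return result;
    }
}
```

Expressed declaratively, the same request leaves the JVM free to answer it during a traversal it already knows how to do, rather than through a per-object callback.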

