aries-dev mailing list archives

From Guillaume Nodet <>
Subject Re: Do we really need Quiesce support?
Date Fri, 13 Feb 2015 09:43:38 GMT
2015-02-13 9:04 GMT+01:00 Christian Schneider <>:

> On 12.02.2015 21:41, Guillaume Nodet wrote:
>> Quiescing a single bundle does not make sense I think. Quiescing
>> obviously means you want to stop a bunch of bundles. In this case, if you
>> want to actually stop bundle A, you also need to stop all bundles using
>> services from A, so that would include B.
> Why would I want to stop the other bundles? The situation that a service
> is going away is a normal thing in OSGi and the frameworks cover this
> situation.
> There are two cases:
> 1. Dynamic systems like DS
> All components using Services from A in a mandatory fashion will stop
> automatically. So no problem here.
> 2. Systems using damping like blueprint
> Blueprint will block calls to Services offered by A if there is no
> alternative service. This also seems to be fine for me. The calls will time
> out after some time and the callers will react according to their error
> handling functionality.
> So I do not see the need for anything better than what OSGi already offers
> for this use case.

I think you did not understand the concept of *clean* shutdown which I'm
trying to explain.
I think there's a big difference between two situations. One is reacting to
the fact that the database connection has been lost, which is out of your
control and where the only thing you can do is stop your own services asap.
The other is cleanly shutting down your application.

Let's take again my example where we have bundle B which is a service
exposed to the outside world and bundle A, a JPA layer.  If I want to
update bundle A and I simply call bundleA.update(), this will cause users
of bundleB to have their calls fail.  What we really need is the following:
  * stop any new users from using bundle B
  * wait until all calls from bundle B have finished
  * stop bundles A and B (which are not actively used anymore)
  * update bundle A
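
To make the first two steps concrete, here is a rough sketch of the idea in
plain Java. QuiesceGate is a hypothetical helper invented for illustration;
it is neither the Aries quiesce API nor anything from OSGi. It only shows
"refuse new callers, then wait for in-flight calls to drain, with a bound":

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of OSGi or the Aries quiesce API.
final class QuiesceGate {
    private final Object lock = new Object();
    private int inFlight = 0;
    private boolean quiescing = false;

    // Every call into the service (bundle B in the example) goes through here.
    <T> T call(Supplier<T> work) {
        synchronized (lock) {
            if (quiescing) {
                throw new IllegalStateException("service is quiescing");
            }
            inFlight++;
        }
        try {
            return work.get();
        } finally {
            synchronized (lock) {
                inFlight--;
                lock.notifyAll(); // wake up a waiting quiesce()
            }
        }
    }

    // Refuses new calls, then waits until in-flight calls have finished.
    // Returns true if everything drained before the timeout, false if the
    // timeout expired or the waiting thread was interrupted.
    boolean quiesce(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (lock) {
            quiescing = true;
            try {
                while (inFlight > 0) {
                    long remaining = deadline - System.nanoTime();
                    if (remaining <= 0) {
                        return false;
                    }
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Only once quiesce() has returned true would it be safe to stop bundles A and
B and then update A, since nobody is in the middle of a call anymore.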

>> This means that in order to stop A, I think you want to quiesce A and B.
>> One problem with blueprint, and it may be an implementation problem, is the
>> following. I just made a test with the blueprint-testquiescebundle which
>> exposes a simple bean with a sleep method through blueprint. If I call it
>> with a long sleep, and I concurrently stop the bundle, the bean will
>> continue executing while the bundle is already stopped. That's quite bad
>> actually.
> I think it is absolutely fine that threads of a bundle that stops or is
> even uninstalled continue to run. We might want to interrupt them so a
> sleep returns earlier but even if the sleep continues I do not see an
> immediate problem. It would just prevent the classloader from cleaning up
> the bundle classes for some time. Btw. I think having such a long sleep is
> an implementation problem that OSGi does not need to fix.

Uh, no that's clearly not fine. BundleActivator#stop() says the following:

 * Called when this bundle is stopped so the Framework can perform the
 * bundle-specific activities necessary to stop the bundle. In general, this
 * method should undo the work that the {@code BundleActivator.start}
 * method started. There should be no active threads that were started by
 * this bundle when this bundle returns. A stopped bundle must not call any
 * Framework objects.
 * <p>
 * This method must complete and return to its caller in a timely manner.

So, we can infer the following things:
  * you're not allowed to block for a long time in stop() while waiting
    for things to be ready to be stopped
  * you're not allowed to leave threads running
This leads to the conclusion that the OSGi api is not sufficient to cover
clean shutdown of bundles.  I think the quiesce api aims to solve this
problem.  I'm not saying it can't be improved or implemented differently,
I'm just trying to explain what I think the purpose of this api is.
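
To illustrate what that contract asks for, here is a minimal sketch in plain
Java. WorkerLifecycle and its methods are hypothetical names; in a real
bundle the two methods would roughly correspond to the bodies of
BundleActivator.start(BundleContext) and BundleActivator.stop(BundleContext).
The grace period is an assumption of mine, the spec only says "timely":

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the work a BundleActivator starts and stops.
final class WorkerLifecycle {
    private ExecutorService workers;

    void start() {
        workers = Executors.newFixedThreadPool(2);
    }

    void submit(Runnable task) {
        workers.submit(task);
    }

    // Bounded stop: waits at most about two grace periods in total.
    // Returns true if no worker thread is left running on return.
    boolean stop(long graceMillis) {
        workers.shutdown(); // accept no new tasks
        try {
            if (workers.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            workers.shutdownNow(); // interrupt tasks that are still running
            return workers.awaitTermination(graceMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that the only way this stop() can be both timely and thread-free is by
interrupting work that has not finished, which is exactly what a clean
shutdown wants to avoid.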

>> I think the quiesce participant for blueprint aims to work around this
>> problem by making sure all calls to a service exposed using blueprint will
>> cause a delay in stopping the bundle. This can be worked around by adding a
>> destroy method to the bundle and synchronise all public methods. It could
>> also be done in a smarter way by adding a read/write lock instead. Anyway,
>> I'm not sure if that's the real problem, but the blueprint quiesce
>> participant actually does that.
> Thanks for describing this. I think the main problem with Quiesce is that
> almost no one understands what problems it solves and if the current
> solution actually solves these problems.

I agree, I'm not sure I fully understand it.  That does not mean we should
get rid of it just because we don't understand it ;-)
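
For reference, the read/write lock workaround I mentioned in the quoted text
above could look roughly like the sketch below. GuardedBean and greet() are
made-up names, and this is not how the blueprint quiesce participant is
implemented; it only shows a destroy method that waits for running calls:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical bean: public methods share the read lock, destroy() takes
// the write lock and therefore blocks until in-flight calls have returned.
final class GuardedBean {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean destroyed = false; // guarded by lock

    String greet(String name) {
        lock.readLock().lock();
        try {
            if (destroyed) {
                throw new IllegalStateException("bean destroyed");
            }
            return "hello " + name;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Would be declared as the bean's destroy method.
    void destroy() {
        lock.writeLock().lock(); // waits for all current readers to finish
        try {
            destroyed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Compared to synchronising every public method, callers can still run
concurrently with each other; only destroy() is exclusive. The wait in
destroy() is unbounded here, so a long-running call delays the stop.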

> So I think what we need is to take a step back and describe what actual
> problems would occur without Quiesce and what Quiesce does to solve them.
> Only then will we be able to maintain the code and to improve it.
> The current status is that we have a proprietary API with no documentation
> anywhere and very little understanding by at least the active devs. So I
> hope some people from IBM can step up and help us to better understand this.
> I am pretty sure the current situation will lead to very bad quality of
> the code around Quiesce in the mid term as people will either avoid modifying
> this code at all or the modifications will be wrong because of bad
> understanding of the design and the issues to solve.

Well, the first thing to do is to try to understand it; that's what I'm
actually trying to do.

> Christian
> --
> Christian Schneider
> Open Source Architect
