james-server-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Danny Angus" <da...@apache.org>
Subject RE: [PATCH] Removing Scheduler dependency, refactoring service code
Date Fri, 11 Oct 2002 07:20:03 GMT
not quite sure if I get this, cant see how related it is to the scheduler question, perhaps
someone will enlighten me..
but if the discussion is about should we have listener->despatcher->worker architecture
with a pool of workers, then yes from me, if it is should we instatiate new workers with "pre-built"
configurations, then I guess thats better than instantiating new workers from scratch.

I cant see any alternative to listener->despatcher->worker , hence my confusion about
the question..


d.

> -----Original Message-----
> From: Noel J. Bergman [mailto:noel@devtech.com]
> Sent: 10 October 2002 22:01
> To: James Developers List
> Subject: RE: [PATCH] Removing Scheduler dependency, refactoring service
> code
> 
> 
> Nathan,
> 
> Seems to me that A (not THE) model might look something like:
> 
> 	Server
> 		Interacts with underlying server platform.
> 
> 		Parses service configuration information,
> 		and prepares configuration necessary for
> 		handlers.
> 
> 		Responsible for providing a handler to
> 		handle each incoming connection.  This
> 		is a factory function, regardless of
> 		whether the objects are pooled or
> 		constructed per use.  The latter is an
> 		implementation issue, not an interface
> 		issue, although if pooling is used, at
> 		some point you need to provide a means
> 		for returning objects to the pool.
> 
> 		Logs information related to service startup.
> 
> 	ConnectionManager
> 		Listens for connections.
> 
> 		When a new connection arrives, it
> 		interacts with a server to associate
> 		a handler, connection and thread.
> 
> 	Connection
> 		Data transport
> 
> 	Handler
> 		Given a connection (transport), it
> 		implements the semantics of a given
> 		protocol.
> 
> 		Protocol (semantic) logging.
> 
> 		Needs configuration information, but
> 		information that isn't per-connnection
> 		shouldn't be re-processed each time.
> 
> 	Config
> 		This is service specific, but is it
> 		even an object?  Where does the
> 		notion of "config" exist?  Some of
> 		it is part of the contract between
> 		Server and ConnectionManager, e.g.,
> 		number of simultaneous connections.
> 		Some of it is part of the contract
> 		between the Handler and the Handler
> 		Factory.  That could be implemented
> 		with a service specific object, or
> 		implemented with bean properties.
> 
> 	Watchdog
> 		Lightweight timeout.
> 
> This is just off-the-cuff.  Reaction?
> 
> 	--- Noel
> 
> -----Original Message-----
> From: Nathan Carter [mailto:nathancarter@gmx.net]
> Sent: Thursday, October 10, 2002 13:50
> To: James Developers List
> Subject: Re: [PATCH] Removing Scheduler dependency, refactoring service
> code
> 
> 
> Peter et al,
> 
> 1) This thread is quite interesting because I've asked many of these
> same questions in beginning to look at the IMAP stuff, which in many
> cases (at least the listener and handler code) seems to be based on old
>   versions of the POP equivalents (i.e., they are missing august and
> september changes by danny and peter to the POP code).
> 
> 2) In your initial post, you mention doing some "basic" and "load"
> testing to further quantify the memory/robustness/scalability issues and
> show how your patch solves them.  What about (and I'd be happy to donate
> time for this):
> 
>    a)using a profiling tool to see how much time is actually being
> wasted by configuring each Handler instead of more tightly coupling them
> to the server listener.
> 
>    b)Do you have before/after metrics for your load tests?  A 2X
> improvement will be a strong argument in itself.  JMeter doesn't seem to
> support email server load testing - is there anything better than this
> (e.g. PushToTest,
> http://www.etestinglabs.com/benchmarks/svrtools/email/t1intro.asp)?
> 
> Again, I'd be more than happy to help in testing because this
> significantly affects what I'm doing in IMAP, and I'm going to have to
> perform this kind of testing on IMAP due to its complexity and higher
> overhead.
> 
> Nathan Carter
> 
> 
> Peter M. Goldstein wrote:
> > Harmeet et al,
> >
> >
> >>Because the scheduler is the wrong abstraction.  This is not a
> >
> > scheduler
> >
> >>problem - it's a timeout.  A timeout is far simpler than a scheduler.
> >>And that's the core issue.
> >>
> >>
> >>[Harmeet]
> >>One time timer task rather than repeated timer tasks is the same thing
> >
> > as
> >
> >>Watchdog timeout. It is property of implementation.
> >>
> >
> >
> > See Noel's comments on this.
> >
> > I also want to add that this situation is directly analogous to the
> > AuthService issue.  The interface itself defines a contract.  In the
> > case of the Scheduler, that contract requires support for multi-threaded
> > lookup of targets on each reset.  This requires lock contention
> > (assuming the implementation properly implements the contract) on each
> > reset.  Lock contention which is both unnecessary, since no handler is
> > interested in the target of any other timeout and potentially expensive
> > (multi-way lock contentions cost ~50x uncontested locks/volatiles in
> > modern JVMs).  The interface contract puts restrictions on the
> > implementation.
> >
> > I'm not going to respond to the rest of the specific scheduler comments,
> > because they all labor under the same confusion of interface contract
> > and implementation.  Noel nailed the crucial point.
> >
> >
> >>[Harmeet]
> >>A Config object can take care of this. Handlers don't need to
> >>expose themselves to lifecycle. What you are describing is
> >
> > configuration
> >
> >>per
> >>handler and as you pointed out that is not needed. You are also saying
> >>configuration is spread out, that can be centralized too in a config
> >>object.
> >>
> >
> >
> > This is not correct.
> >
> > There are a couple of real problems with this proposal, some of them
> > highlighted in Noel's reply, some not.
> >
> > Let's start with the situation.  We have a block definition in Phoenix
> > that represents a service.  This class representing this block
> > (NNTPServer, SMTPServer, etc.) is instantiated by Phoenix (or any other
> > Avalon container) and is passed a read-only Configuration object (see
> > the comments for the Configuration interface).  What happens now?
> >
> > In the current code base the following behavior occurs.  Some parameters
> > (port, isSSL, enabled) are read out of the configuration, validated, and
> > set to defaults if they are not present.  The parameters that are read
> > and validated are logged, the rest are not.  Note that even this
> > functionality requires customized Server classes for each service.
> >
> > A child of the Configuration is then passed to the superclass
> > AbstractService.  There is no clear distinction between the elements
> > that are in the top level Configuration object and its child class.
> > When the first connection (and every subsequent connection) is made, the
> > remainder of the Configuration parameters are read and validated.  This
> > expense is incurred on each connection.  Note that the server may fail
> > because of bad configuration upon first connection, which may be hours
> > or even days after the server was started.
> >
> > Changes to the handler which add additional component dependencies now
> > require alteration to the supposedly independent server configuration,
> > both in the .xinfo and in the assembly.xml.  This is totally non-obvious
> > from the "decoupled" listener/handler model.
> >
> > Now consider the situation with the changed code.  The server starts and
> > all service configuration parameters are read, validated, and set to
> > defaults if they are not present.  Any fatal misconfiguration will be
> > detected here and cause server failure.  All service configuration
> > information is logged, so an administrator can simply look at the
> > startup log and see how the service was configured.  There is no extra
> > per-connection logging to that prints a config parameter that doesn't
> > change over the life of the server.  It's stated just once in the log.
> >
> > Since the handler is viewed as tightly coupled to the listener, and
> > dependency checking is understood to be a function of the listener
> > block, it is obvious that changes to the handler that introduce new
> > dependencies will require modification of the listener configuration
> > files.
> >
> > Personally I much prefer the second scenario.  It's cleaner from both an
> > administrator and a developer point of view.  See below for reasons why
> > the potential third scenario - the suggestion to "cache" configuration
> > information - is badly flawed.
> >
> >
> >>[Harmeet]
> >>Listener-dispactch(handler) is the pattern followed in current code.
> >
> > It is
> >
> >>more flexible than  server class hierarchy. The missing piece is
> >
> > confusing
> >
> >>and repeated configuration of the handler not a problem with the
> >
> > pattern
> >
> >>iteself. Most other servers try to separate the listeners from
> >
> > dispatch
> >
> >>workers/handlers.
> >
> >
> > The current code is a listener-dispatch pattern.  It is a
> > listener-dispatch pattern, and would remain so after the changes I
> > propose.
> >
> > I don't think you quite understand what a listener-dispatch pattern
> > implies.  There is absolutely no requirement in that pattern (and
> > certainly it doesn't hold true in generic implementations) that the
> > listener be generic.  In general, the listener is specific to the
> > particular service being implemented.  Usually there is a common
> > interface that describes the listener-handler (in this case
> > ConnectionHandlerFactory / ConnectionHandler) but there is certainly no
> > requirement (nor is it desirable) that either part of the pattern be
> > generic.  This pattern is all about a division of responsibility, not
> > about the ability to swap in and out different partners.  The division
> > of responsibility is clarified in my proposal, as all the configuration
> > is moved to the listener and the handler can focus on its purpose -
> > describing the logic for a particular transaction type (NNTP, SMTP,
> > etc.).
> >
> > As I stated before, neither side is generic in the current code base.
> > The server or listener side inherits from a reasonably heavyweight
> > AbstractService class - about the same weight as the
> > AbstractJamesService class, but with fewer features.
> >
> >
> >>[Harmeet]
> >>Params/Config need not require revalidation and if they do there will
> >
> > be
> >
> >>the
> >>overhead that you are avoiding by server class hierarchy.
> >>
> >
> >
> > Again, this is not correct.  You need to validate your config
> > parameters.  You need to do it somewhere.  If you load and parse the
> > parameters in the handler, you do it in the handler.  If you load and
> > parse the parameters in the server class, you do it in the server class.
> >
> >
> >>[Harmeet]
> >>Cache the config. Precalculation for handlers, isn't that the only
> >
> > gain in
> >
> >>having a server class hierarchy. Maybe that idea can be abstracted and
> >>used
> >>with current model.
> >>
> >
> >
> > No, this isn't the only gain in having a server class hierarchy.  See
> > this email and the previous one for a discussion of gains.  They
> > include, but aren't limited to, better error handling, clearer logging,
> > simpler configuration, and simpler code.
> >
> > Moreover this caching idea you suggest is even worse than the current
> > code.  Consider the situation.  Since you object to distinguishing
> > between the server objects, presumably you're going to have the "first"
> > handler for the service initialize the configuration and all others will
> > access it.  That's because someone has to initialize the config.  If
> > you've got a generic server with no ability to interpret the content of
> > the configuration, it has to be the handler.  This is a classic server
> > design anti-pattern.  Because what you've done is introduced a mutable
> > (because the handler has to do the initial parsing and setting to
> > defaults), shared configuration object.  Access to this object will
> > require a lock on every call, because the handler has to check if the
> > config is initialized and, if not, initialize it.  This becomes a
> > heavily contested lock.  Very expensive, destroys performance.
> >
> > Now, let's imagine that we admit that the natural place for the
> > configuration parsing is in the server class.  But we still want to
> > stick with this notion of "generic" arguments passed to the handler.
> > What's wrong with this?  The basic problem with it is that you don't
> > gain anything over the approach I suggest and you lose type safety.  The
> > server class still needs to know what type of service it represents in
> > order to parse and validate the configuration parameters.  In addition,
> > if we restrict ourselves to the Avalon configuration rather than some
> > Properties object, we lose even more. Configuration parameters that
> > aren't easily represented as Strings become difficult to pass to the
> > handler.  We have to put some String key in place, and give the handler
> > the Configuration.  It pulls out the String, and looks up the non-String
> > thing.  Ugh.
> >
> > This is simple?
> >
> >
> >>>>This leads to tighter coupling between the
> >>>>Handlers and the corresponding server classes, but that's simply
> >>>>recognition of a pre-existing reality.  These classes are
> >>>
> > extremely
> >
> >>>>tightly coupled.
> >>>
> >>>Server is a listener. Handler contains protocol specific stuff. Why
> >>
> >>would
> >>
> >>>there be a (tight) coupling ?
> >>>It is a listen-dispatch mechanism. Shouldn't it be as loosely
> >>
> > coupled
> >
> >>as
> >>
> >>>possible ?
> >>
> >>This is a somewhat nave view of a service architecture.  Servers
> >
> > aren't
> >
> >>generic.  If they were, we'd need only one server class period.  But
> >>servers are different from one another, so we need separate server
> >>classes.  You can see how the servers differ by examining the diffs
> >
> > for
> >
> >>the assorted server classes that were submitted with this patch.
> >>
> >>
> >>
> >>[Harmeet]
> >>No I am saying listeners are generic. Handlers/Workers are specific. A
> >
> > mix
> >
> >>of two makes service and servers.
> >>
> >
> >
> > Listeners are not generic.  Not even close.  They aren't now, they can't
> > be for the reasons I outlined in my last email, and there's been no
> > compelling reason presented for them to be generic.
> >
> >
> >
> >>[Harmeet]
> >>But a socket listener associated with SMTP handler makes an SMTP
> >
> > Service.
> >
> >>That is the current state.
> >>An incremental and better system would be Socket Listener with Config
> >>Object
> >>and Handler makes a Service.
> >>
> >>A Bad State would be a server class hierarchy that couples Listener,
> >>Config
> >>and Handler.
> >>
> >>-------------------------------
> >
> >
> > I couldn't disagree more with this statement.
> >
> > It's a typical expression of extensibility at any cost.  The trick to
> > writing extensible code is to pick your extension points.  The
> > Configuration object is generic, so the class itself is not coupled to
> > the server or the handler.  Of course the specific values stored in that
> > Configuration object ARE tightly coupled to the SMTP server.  They have
> > to be, as they're the values that configure how the SMTP server runs.
> >
> > The Handler and Listener should be tightly coupled.  There is no reason
> > why an NNTPServer should work with a SMTPHandler.  That wouldn't make
> > any sense.  Even in the proposed solution they wouldn't work together -
> > they'd just fail miserably on each connection because the SMTPHandler
> > wouldn't have the parameters it needed.  The handler and listener are
> > tightly coupled precisely because they are implementations of a service
> > - a service that has configuration, server specific details, and a
> > protocol.
> >
> >
> >>[Harmeet]
> >>Port, queus size etc  are the only listener concerns.
> >>
> >
> >
> > Port, queue, etc.?  What's hidden in the etc.?
> >
> > Trivializing the listener concerns like this is just hiding the issue.
> > The listener is responsible for socket and connection management, so it
> > needs to be able to specify the ConnectionManager and socket type.  As
> > the ConnectionManager needs a thread group to invoke a server
> > connection, it needs a thread group.  For certain services there will be
> > configuration restriction logic related to the socket types (imagine
> > publishing an HTTPS service with TCP/IP sockets).  This all belongs in
> > the listener.   The handler can't deal with any of it, because by the
> > time the handler is invoked this needs to have been already dealt with.
> > So the listener has to have a very non-trivial configuration.
> >
> >
> >>-------------------------------
> >>
> >>Second, there are resources provided by the server that depend on the
> >>configuration.  Yes, it's possible to re-read the configuration on a
> >
> > per
> >
> >>handler invocation basis.  But that's a bad idea for the reasons
> >>described above.
> >>
> >>[Harmeet]
> >>That is why I am suggesting params or configuration object, cached for
> >
> > a
> >
> >>class of handlers.
> >
> >
> > See above why a param (lose type safety) or configuration object (can
> > only hold Stringifiable objects) is a bad idea.
> >
> >
> >>[Harmeet]
> >>Are you saying Avalon is too complicated. Avalon complexity or user
> >>friendliness is not a reason to change a fairly standard
> >
> > listener-dispatch
> >
> >>model. If there are problems with Phoenix, would it be better to not
> >
> > use
> >
> >>it
> >>? That is a different topic. Complexity in one part should ideally not
> >
> > be
> >
> >>reason a rearchitect another part.
> >
> >
> > This is just silly.  What's being critiqued here as over-complex is a
> > vast simplification of one of the most difficult aspects of server
> > design.
> >
> > Let me be clear - this is one of the best parts of Phoenix.  It's
> > incredibly well designed and well thought out and it takes one of the
> > most annoying parts of server design (getting all your dependency
> > checking right) and makes it trivial.
> >
> > Avalon in general, and Phoenix as a specific container is based around
> > the idea of component oriented programming.  Each of the services we
> > define - the listener classes - comprises a component.  It's a component
> > because it can be uniquely defined by a name, provides one or more
> > interfaces, exists for the life of the container, and can be looked up
> > by the other components running in the container.  These characteristics
> > are all necessary to establish a dependency chain and ensure that each
> > component can both declare what it needs to function and be evaluated as
> > part of a dependency chain.
> >
> > The handler instances cannot be components.  There are some arbitrary
> > number of them, they don't have unique names, they can't be accessed by
> > other components, and they have very limited lifetimes.  Thus they don't
> > meet the minimum requirements.
> >
> > Avalon is great, and Phoenix is a great container.  Chucking it is
> > really out of the question.  This is not a fault in the Avalon
> > framework, but rather a fault in your proposal.  Your proposal violates
> > basic principles of component oriented design.
> >
> >
> >
> >>-------------------------------
> >>
> >>[Harmeet]
> >>You can have different services. POP3 and POP3 Secure. Doesn't that
> >
> > achive
> >
> >>what you are describing. If you POP3 is a service, POP3 (plain) and
> >
> > POP3
> >
> >>Secure are 2 instances of it. It depends on what you mean by a
> >
> > service.
> >
> >>Effectiely if you want different listereners/threadgroups for a
> >
> > protocol,
> >
> >>that exists.
> >
> >
> > No, it doesn't.  You are misunderstanding the problem.  The ability to
> > configure different socket types and different thread groups is not
> > present in the current code.  Period.  The configuration doesn't look
> > for it.  All it can do is "plain" sockets vs. "ssl".  That's it.  And
> > the thread pool is always the default thread pool.  There is no way to
> > configure the server to use different thread pools for different
> > services.
> >
> >
> >
> >>-------------------------------
> >>
> >>As far as the ability to handle multiple socket types, that wasn'tb
> >>there.  The socket type was one of two hard coded strings, and the SSL
> >>selection was Boolean.  So if you wanted to have two different SSL
> >>socket types, one for use by the NNTP server and another for use by
> >
> > the
> >
> >>SMTP server (a very common configuration), then you would have to run
> >>two separate servers.  Very bad.
> >>
> >>[Harmeet]
> >>It is pretty standard. Take a look at tomcat. They do the same thing.
> >>
> >
> >
> > No, it isn't.  Almost nobody writes production servers that handle SSL
> > for multiple services without providing the ability to have multiple SSL
> > connection types.
> >
> > This is also a misunderstanding of the standard recommended production
> > deployment of Tomcat.
> >
> > Tomcat is a reference implementation of the Servlet/JSP specs.  As such
> > it was an explicit decision to focus on the Servlet/JSP processing in
> > Tomcat.  The network interface was left uber-simple on purpose.  The
> > static page delivery wasn't tuned.  Why?  Because the developers wanted
> > to focus on the core functionality.
> >
> > And that's why the recommended deployment of Tomcat is in a reverse
> > proxy environment.  Apache in front to handle complex connection
> > management and configuration (and it certainly does allow multiple SSL
> > connection types) and to handle static page delivery.  And then Apache
> > proxies Servlet and JSP requests to the Tomcat server.  That allows the
> > production environment to take advantage of Apache's strong connection
> > management.
> >
> > This is a recognized weakness of the Tomcat implementation.  That's why
> > there is always a lot of discussion about embedding Tomcat in Avalon
> > containers, etc. on the Avalon mailing lists.  Because a container that
> > implements the Avalon framework would provide all of the complex
> > connection management features that the Tomcat implementation lacks and
> > allow you to run more complex systems in non-reverse proxy environments.
> >
> > I don't see any reason to hobble ourselves with an unnecessary flaw.
> >
> >
> >>[Harmeet]
> >>Basically if n lines have been written out, observer resets the timer.
> >>rather than only after all the lines have been written out.
> >>
> >
> >
> > And this won't work.  Why?  Because it ignores the distinction between
> > observer and observable as well as the conditions under which we want a
> > reset.  The purpose of the reset here is to prevent transaction hangs,
> > as much from server processing as from client processing.  So the
> > primary trigger reset is not the amount of data written, but rather the
> > trigger that occurs every time a command is processed.  A command being
> > processed indicates that the transaction is proceeding.  As this
> > information is not available to the stream (it has no idea when a
> > command has completed), the stream cannot be the observable in this
> > situation.  The stream data is crucial for commands that potentially
> > involve large transfers of data, so the handler can't be the observable.
> > So the pattern doesn't fit.
> >
> >
> >
> >>[Harmeet]
> >>The observer-observable is a clean and standard pattern way. It allows
> >
> > an
> >
> >>observer to monitor and decide on timeout reset.
> >>
> >
> >
> > It is a clean and standard pattern.  It's just the wrong one for this
> > situation.  See above.
> >
> > --Peter
> 
> 
> --
> To unsubscribe, e-mail:   
> <mailto:james-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: 
> <mailto:james-dev-help@jakarta.apache.org>
> 


--
To unsubscribe, e-mail:   <mailto:james-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:james-dev-help@jakarta.apache.org>


Mime
View raw message