james-server-dev mailing list archives

From Sid Stuart <...@weaselworks.com>
Subject Re: Processor naming thoughts?
Date Wed, 16 Apr 2003 19:37:43 GMT
alan.gerhard wrote:

>However, if you feel the overall construct of the DTD needs
>improvement, please, make suggestions to a new and more
>effective structure.
Hi Alan,

You may regret writing the above. :)

The goal in modifying the james-config.xml syntax would be to make the 
file simpler to understand for its readers. The readers are two 
separate audiences: a non-Java-cognizant administrator who is setting 
James up, and a Java developer who is working with James. If one desires 
James to be popular, then one should place the higher priority on the 
first audience as there are (unfortunately) many more people in the 
world who don't know Java than do.

To appease the first audience, the syntax of the XML file should reflect 
the semantics of the defined operations as closely as possible. To 
please the second audience, the elements of the XML file should reflect 
the classes of the implementation as closely as possible.

(Here is where I get into dangerous territory, please feel free to 
correct all the errors in my assumptions.)

What semantics does the "processor" element define? There are two parts 
to the semantics, first the control flow of messages:

        If a-specified-matcher-operation-is-true then
        do one-of-three-things

            1. Pass the message to a mailet for processing.
            2. Pass the message to a mailet for processing and then jump
               to another processor.
            3. Jump to another processor.

Second, the structure of the processor tree,

        There must be a root processor.
        There must be an error processor.
        There must be a ghost processor.
        There should be a trash processor. (To replace the null mailet
        class for non-programmers.)
        There may be generic processors.

The first pass at syntax to reflect the above semantics is given in the 
following partial DTD. (Throw away your preconceptions and hold on to 
your seats.)  An annotated version of the DTD is provided below the 
partial DTD. Below the annotated version (entering the third level of 
hell), a partial translation of the james-config.xml file is provided as 
an example.
A partial DTD:

<!ELEMENT messageProcessor (rootTaskList, errorTaskList, taskList*,
                            ghostTaskList, trashTaskList)>
<!ELEMENT rootTaskList (matcher+)>
    <!ATTLIST rootTaskList name CDATA #FIXED "root">
<!ELEMENT errorTaskList (matcher+)>
    <!ATTLIST errorTaskList name CDATA #FIXED "error">
<!ELEMENT taskList (matcher+)>
    <!ATTLIST taskList name CDATA #REQUIRED>
<!ELEMENT ghostTaskList EMPTY>
    <!ATTLIST ghostTaskList name CDATA #FIXED "ghost">
<!ELEMENT trashTaskList EMPTY>
    <!ATTLIST trashTaskList name CDATA #FIXED "trash">

<!ELEMENT matcher (parameter*, mailet?)>
    <!ATTLIST matcher class      CDATA #REQUIRED
                      transferTo CDATA #IMPLIED>

<!ELEMENT mailet (parameter*)>
    <!ATTLIST mailet class CDATA #REQUIRED>

<!-- value is CDATA rather than NMTOKEN because values such as
     file://var/mail/error/ contain characters NMTOKEN does not allow -->
<!ELEMENT parameter EMPTY>
    <!ATTLIST parameter name  CDATA #REQUIRED
                        value CDATA #IMPLIED>
Annotated version of the partial DTD:

--------- Message Processor

<!ELEMENT messageProcessor (rootTaskList, errorTaskList, taskList*,
                            ghostTaskList, trashTaskList)>

There is one message processor per james-config.xml file. It defines all 
the steps taken to process each message. The messageProcessor element is 
introduced to enforce the tree structure semantics discussed in the 
documentation. The content model above says the messageProcessor 
element must contain, in this order, one rootTaskList, one errorTaskList, 
zero or more taskList elements (signified by the asterisk after 
taskList), one ghostTaskList and one trashTaskList. The messageProcessor 
element has no attributes.

A sample of the syntax looks like,

    <messageProcessor>
        <rootTaskList> ... </rootTaskList>
        <errorTaskList> ... </errorTaskList>
        <taskList name="spam"> ... </taskList>
        <ghostTaskList />
        <trashTaskList />
    </messageProcessor>

---------- rootTaskList

<!ELEMENT rootTaskList (matcher+)>
    <!ATTLIST rootTaskList name CDATA #FIXED "root">

The *TaskList elements replace the "processor" element in the old 
syntax. The rootTaskList element is the same as a taskList element 
except that its one attribute, "name", is fixed to be "root". The 
rootTaskList element must have at least one matcher element and may have 
more (signified by the + after matcher). The errorTaskList element is the 
same thing with a different fixed name attribute, so I will skip that.

---------- taskList

<!ELEMENT taskList (matcher+)>
    <!ATTLIST taskList name CDATA #REQUIRED>

The taskList element is similar to the rootTaskList and errorTaskList 
elements except that its name attribute is not fixed, but it is 
required. It is used to define named lists of message processing tasks, 
such as the "spam" list that a matcher's transferTo attribute refers to.

--------- ghostTaskList
<!ELEMENT ghostTaskList EMPTY>
    <!ATTLIST ghostTaskList name CDATA #FIXED "ghost">

The ghostTaskList contains no elements and is incredibly boring (so 
boring it was left out of the original definition). I think it should be 
put in for consistency. In a real file it would look like this and would 
be put at the bottom of the messageProcessor element where no one would 
notice it,

    <ghostTaskList />

Same for the trashTaskList,

    <trashTaskList />

-------- trashTaskList

The trashTaskList provides a non-programmer friendly way of specifying 
that a message should be discarded. For example,

<matcher class="MatchSender" transferTo="trash">
    <parameter name="senderName" value="sid" />
</matcher>

------------- matcher

<!ELEMENT matcher (parameter*, mailet?)>
    <!ATTLIST matcher class      CDATA #REQUIRED
                      transferTo CDATA #IMPLIED>

The matcher element is where the new syntax starts to cover the old 
semantics. It has a required attribute, class, which specifies which 
matcher class to use. It also has an optional attribute, transferTo, 
which specifies that the processing of the matched message will be 
transferred to another taskList. It may contain 0 or more parameter 
elements, whose attributes will be passed to the matcher class. It may 
contain 0 or 1 mailet elements. (Though I will argue that multiple 
mailets should be allowed. Allowing multiple mailets to be assigned to 
each matcher allows each mailet to tackle one simple task. The only 
problem I see is that it breaks the matcher/mailet pair paradigm.) If 
both a mailet element and the transferTo attribute are present, the 
message will be processed by the mailet before the transfer to the new 
taskList is made.

A sample of the syntax looks like,

    <matcher class="RelayLimit" transferTo="trash">
          <parameter name="limit" value="30" />
    </matcher>
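
As a rough sketch, the three control-flow cases listed near the top of this message would map onto the matcher element roughly as follows. The class names are the ones used elsewhere in this message; the spam repository path is made up for illustration, and none of this has been run against James.

```xml
<!-- Case 1: pass the matched message to a mailet, no transfer -->
<matcher class="All">
    <mailet class="ToRepository">
        <parameter name="repositoryPath" value="file://var/mail/error/" />
    </mailet>
</matcher>

<!-- Case 2: pass the message to a mailet, then jump to another task list.
     The repositoryPath value here is hypothetical. -->
<matcher class="InSpammerBlackList" transferTo="trash">
    <parameter name="URL" value="blackholes.mail-abuse.org" />
    <mailet class="ToRepository">
        <parameter name="repositoryPath" value="file://var/mail/spam/" />
    </mailet>
</matcher>

<!-- Case 3: jump to another task list without running a mailet -->
<matcher class="RelayLimit" transferTo="trash">
    <parameter name="limit" value="30" />
</matcher>
```

In each case the parameter elements directly under matcher belong to the matcher class, and the ones nested inside mailet belong to the mailet class.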

--------------- mailet

<!ELEMENT mailet (parameter*)>
    <!ATTLIST mailet class CDATA #REQUIRED>

The mailet element may have zero or more parameter elements. It also has 
one required attribute, class, the name of the class needed to process 
the message.

A sample of the syntax looks like,

     <mailet class="ToRepository">
             <parameter name="repositoryPath"
                        value="file://var/mail/error/" />
     </mailet>

----------- parameter

The parameter element contains no elements and has two attributes, the 
name of the parameter and the value associated with the name. The name 
is required, the value is not. It is used to pass name/value pairs to 
the enclosing matcher or mailet class. The parameter element removes the 
need for specialized elements like the notify, delayTime or maxRetries 
elements.

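<!-- value is CDATA rather than NMTOKEN because values such as
     file://var/mail/error/ contain characters NMTOKEN does not allow -->
<!ELEMENT parameter EMPTY>
    <!ATTLIST parameter name  CDATA #REQUIRED
                        value CDATA #IMPLIED>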
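
To make the point about specialized elements concrete, here is a before/after sketch. The old-style fragment is written from memory of the shipped james-config.xml and the numeric values are only illustrative; the new-style fragment assumes the matcher, mailet and parameter elements proposed above.

```xml
<!-- Old style (from memory, values illustrative): one element per setting -->
<mailet match="All" class="RemoteDelivery">
    <delayTime>21600000</delayTime>
    <maxRetries>5</maxRetries>
</mailet>

<!-- Proposed style: the same settings as generic name/value pairs -->
<matcher class="All">
    <mailet class="RemoteDelivery">
        <parameter name="delayTime" value="21600000" />
        <parameter name="maxRetries" value="5" />
    </mailet>
</matcher>
```

The trade-off is that the DTD can no longer check which settings a given mailet accepts; that check moves into the mailet class itself.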

Example implementation:
This is a partial rewrite of the processor section of the 
james-config.xml file.


    <rootTaskList>
       <matcher class="RelayLimit" transferTo="trash">
          <parameter name="limit" value="30" />
       </matcher>

       <matcher class="InSpammerBlackList" transferTo="spam">
          <parameter name="URL" value="blackholes.mail-abuse.org" />
          <parameter name="URL" value="dialups.mail-abuse.org" />
          <parameter name="URL" value="relays.mail-abuse.org" />
          <mailet class="AddErrorMesage">
             <parameter name="message" value="Rejected - see [...]" />
          </mailet>
       </matcher>

       <matcher class="All" transferTo="transport" />
    </rootTaskList>

    <errorTaskList>
       <matcher class="All">
          <mailet class="ToRepository">
             <parameter name="repositoryPath"
                        value="file://var/mail/error/" />
          </mailet>
       </matcher>
    </errorTaskList>
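
For anyone who wants to try the idea out, below is a sketch of a complete, self-contained document that carries the proposed DTD as an internal subset. Several things in it are my assumptions rather than part of the proposal: every attribute is typed CDATA, optional attributes are #IMPLIED, the two empty elements are declared EMPTY, and the contents of the "spam" task list (including its repository path) are invented just to make the document complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE messageProcessor [
  <!ELEMENT messageProcessor (rootTaskList, errorTaskList, taskList*,
                              ghostTaskList, trashTaskList)>
  <!ELEMENT rootTaskList (matcher+)>
  <!ATTLIST rootTaskList name CDATA #FIXED "root">
  <!ELEMENT errorTaskList (matcher+)>
  <!ATTLIST errorTaskList name CDATA #FIXED "error">
  <!ELEMENT taskList (matcher+)>
  <!ATTLIST taskList name CDATA #REQUIRED>
  <!ELEMENT ghostTaskList EMPTY>
  <!ATTLIST ghostTaskList name CDATA #FIXED "ghost">
  <!ELEMENT trashTaskList EMPTY>
  <!ATTLIST trashTaskList name CDATA #FIXED "trash">
  <!ELEMENT matcher (parameter*, mailet?)>
  <!ATTLIST matcher class      CDATA #REQUIRED
                    transferTo CDATA #IMPLIED>
  <!ELEMENT mailet (parameter*)>
  <!ATTLIST mailet class CDATA #REQUIRED>
  <!ELEMENT parameter EMPTY>
  <!ATTLIST parameter name  CDATA #REQUIRED
                      value CDATA #IMPLIED>
]>
<messageProcessor>
  <rootTaskList>
    <!-- Too many relay hops: discard -->
    <matcher class="RelayLimit" transferTo="trash">
      <parameter name="limit" value="30" />
    </matcher>
    <!-- Blacklisted sender: hand over to the spam task list -->
    <matcher class="InSpammerBlackList" transferTo="spam">
      <parameter name="URL" value="blackholes.mail-abuse.org" />
    </matcher>
  </rootTaskList>
  <errorTaskList>
    <matcher class="All">
      <mailet class="ToRepository">
        <parameter name="repositoryPath" value="file://var/mail/error/" />
      </mailet>
    </matcher>
  </errorTaskList>
  <!-- Hypothetical spam list: store the message, then discard it -->
  <taskList name="spam">
    <matcher class="All" transferTo="trash">
      <mailet class="ToRepository">
        <parameter name="repositoryPath" value="file://var/mail/spam/" />
      </mailet>
    </matcher>
  </taskList>
  <ghostTaskList />
  <trashTaskList />
</messageProcessor>
```

Saved to a file, this should pass a validating parser, for example `xmllint --valid --noout` on that file. Note that the DTD only checks structure; it does not verify that a transferTo value names a task list that actually exists.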



