james-server-dev mailing list archives

From "Noel J. Bergman" <n...@devtech.com>
Subject RE: new InputStream class for mail data
Date Mon, 14 Jul 2003 20:02:12 GMT
> The changes in behavior are arguable.

Not really.  The RFC is clear enough.

RFC 2821, section 3.3:

   SMTP indicates the end of the mail data by sending a
   line containing only a "." (period or full stop).

> I argue that the right end of data indicator to recognize
> is "a period alone in a line" rather than "CRLF.CRLF", but
> it seems that many people see it differently.

No "argument" of this nature is necessary.  "A period alone in a line" *IS*
<CRLF>.<CRLF>.  They are identical, by definition.  RFC 2821, section
4.1.1.4, states:

   The mail data is terminated by a line containing only a period, that
   is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2).  This
   is the end of mail data indication.  Note that the first <CRLF> of
   this terminating sequence is also the <CRLF> that ends the final line
   of the data (message text) or, if there was no data, ends the DATA
   command itself.  An extra <CRLF> MUST NOT be added, as that would
   cause an empty line to be added to the message.  The only exception
   to this rule would arise if the message body were passed to the
   originating SMTP-sender with a final "line" that did not end in
   <CRLF>; in that case, the originating SMTP system MUST either reject
   the message as invalid or add <CRLF> in order to have the receiving
   SMTP server recognize the "end of data" condition.

Ironically, you made note of those two specific sections, but you found
ambiguity in your reading.  There is no ambiguity involved.  There IS a
<CRLF>.<CRLF> in all cases.  The only "trick" is realizing that the first
<CRLF> is the one that terminated the DATA command or line of data.  The
only EXTRA data is the .<CRLF>, but it must be preceded by a <CRLF> in valid
SMTP messages.  Lines are separated by <CRLF>, therefore in order to be
alone on a line, you must be contiguous with <CRLF> on either side.
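
To make that concrete, here is a minimal sketch of a scanner that treats "a
period alone in a line" and <CRLF>.<CRLF> as the same thing.  It is not James
code: the class name EndOfDataScanner and the method readMailData are made up
for illustration.  It assumes the <CRLF> ending the DATA command has already
been consumed, so it starts out at the beginning of a line.  It also drops the
leading dot of a dot-stuffed line (section 4.5.2) and rejects a bare <CR>; a
bare <LF> is passed through unchecked.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of James.
final class EndOfDataScanner {
    private enum State { LINE_START, DOT, DOT_CR, MID_LINE, CR }

    // Reads mail data up to the terminating <CRLF>.<CRLF> and returns the
    // message text, including the <CRLF> that ends its final line.
    static byte[] readMailData(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The <CRLF> that ended the DATA command was already consumed,
        // so the very first byte read here is at the start of a line.
        State state = State.LINE_START;
        int b;
        while ((b = in.read()) != -1) {
            switch (state) {
                case LINE_START:
                    if (b == '.') { state = State.DOT; continue; } // hold the dot
                    break;
                case DOT:
                    if (b == '\r') { state = State.DOT_CR; continue; }
                    // Dot stuffing: the held leading dot is simply dropped.
                    break;
                case DOT_CR:
                    if (b == '\n') return out.toByteArray(); // <CRLF>.<CRLF> seen
                    throw new IOException("bare CR after leading period");
                case CR:
                    if (b != '\n') throw new IOException("bare CR in mail data");
                    out.write(b);
                    state = State.LINE_START;
                    continue;
                default: // MID_LINE: nothing special to check
                    break;
            }
            out.write(b);
            state = (b == '\r') ? State.CR : State.MID_LINE;
        }
        throw new EOFException("stream ended before <CRLF>.<CRLF>");
    }
}
```

Note that an input of just ".<CRLF>" right after the DATA command ends the
data immediately with an empty message, which is the "no data" case the RFC
describes.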

--------------------

Now, as for the code itself.

As I said to Serge, I hadn't had time to test your code.  Also, I'm not
quite sure what goal you are trying to achieve with the change.  Would you
please elaborate?

You wrote that "The code we are using now employs buffer after buffer, and I
suspect that this redundant buffering may be unnecessary", but the only
buffers that I am finding present in the SMTP handler at the moment (I could
have missed something) are the BufferedInputStream assigned to "in", and the
line buffer in CRLFTerminatedReader.  The rest of the streams are
unbuffered, and just add behavior.  One is a FilterInputStream subclass,
and the rest probably should be, including yours.
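
For illustration, this is roughly what I mean by a stream that adds behavior
without adding a buffer.  It is only a sketch, and the class name
ByteCountingInputStream is hypothetical rather than taken from the James
sources.  It extends java.io.FilterInputStream, passes every read straight
through to the wrapped stream, and just counts the bytes going by.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example; the class name is not taken from the James sources.
final class ByteCountingInputStream extends FilterInputStream {
    private long count; // bytes delivered to the caller so far

    ByteCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read(); // delegates directly to the wrapped stream
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // No internal buffer: the caller's array is handed to the wrapped stream.
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    long getCount() {
        return count;
    }
}
```

FilterInputStream.read(byte[]) already forwards to the three-argument read,
so overriding those two methods covers ordinary reads.  Bytes passed over
with skip() are not counted in this sketch.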

There USED to be a problem with redundant buffering.  Serge thought that he
had a solution to it around the New Year, but it didn't work, so we reverted
it as there are higher priorities to change in the code.  No one went back
to find out what was wrong, but you ended up fixing it quite neatly with
your CRLFTerminatedReader class.

So the only two buffers are (1) used to provide efficiency into the protocol
stack, and (2) used to handle line accumulation.  The DATA command is processed
as a stream, and does not use the line buffer.

By the way, I was surprised to find this in your code:

   /* We have received the sequence
      PERIOD CR
      at the beginning of a line, but it is followed
      by something other than LF .
      So this is an unusual case of dot stuffing.
      We return CR, and buffer b to return on the next
      call.
    */

The RFC states that <CR> MUST NOT appear except paired with <LF>.  You know
this because we addressed that in CRLFTerminatedReader.

The primary reason I have to change the I/O handling is to support nio.
There is some value to reducing the method chaining, but there is a tradeoff
regarding method complexity.  If you can point out where I am missing
redundant buffers, that's fine.  I'm all for eliminating any redundancy, but
right now the only redundant data I see is the accumulated data in the line
buffer.  Unless a solution addresses the nio issue, or I'm missing the key
point, I don't see much of a reason to change.

Obviously you had a reason for going to all this trouble, so what am I
missing?

	--- Noel


---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org

