jakarta-oro-user mailing list archives

From "Daniel F. Savarese" <...@savarese.org>
Subject Re: OROMatcher-1.1 source?
Date Sun, 12 Nov 2000 17:26:48 GMT
>the matching process will be time consuming and just terminate). I looked
>forward to do that with the open source library, but everything connected
>to streamed input has been removed.

The streamed input approach was fundamentally flawed because of Perl's
maximal matching, zero-width assertions, and other characteristics.  See
the following paper for an explanation:
"On the Use of Regular Expressions for Searching Text", Clarke and Cormack,
 ACM Transactions on Programming Languages and Systems, Vol 19, No. 3,
 pp 413-426.
The situation is different for the DFA-based AWK package, although its
stream matching is still deficient because it can end up reading all of
the input.  You may wish to base your code on the AWK package's stream
matching, but keep in mind that the only effective ways of doing stream
matching are to either mandate minimal matching or use the 'expect'
approach of defining some buffer size that slightly exceeds the maximum
size of a match you expect to make.  To be completely correct 100% of the
time without any restrictions, you have to read all of the input, which
makes stream matching pointless (which is why it was removed).
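
A rough sketch of the 'expect' approach, written against java.util.regex
rather than the ORO classes, might look like the following.  The class name
WindowedStreamMatcher, the method findAll, and the parameter maxMatchLen are
all made up for illustration.  It assumes that no match is ever longer than
maxMatchLen characters and that the pattern never matches the empty string,
and it makes no attempt to handle anchors or lookaround across the point
where the buffer is trimmed.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: bounded-buffer ("expect"-style) matching over a Reader.
class WindowedStreamMatcher {

    // Assumption: every match is at most maxMatchLen characters long and the
    // pattern cannot match the empty string.
    static List<String> findAll(Reader in, Pattern pattern, int maxMatchLen)
            throws IOException {
        List<String> matches = new ArrayList<>();
        StringBuilder window = new StringBuilder();
        char[] chunk = new char[Math.max(maxMatchLen, 1)];
        boolean eof = false;

        while (!eof) {
            int n = in.read(chunk);
            if (n < 0) {
                eof = true;
            } else {
                window.append(chunk, 0, n);
            }

            Matcher m = pattern.matcher(window);
            int keepFrom = 0; // end of the last match accepted in this pass
            while (m.find()) {
                // A match starting too close to the end of the window could
                // still grow once more input arrives, so defer it.
                if (!eof && m.start() + maxMatchLen > window.length()) {
                    break;
                }
                matches.add(m.group());
                keepFrom = m.end();
            }

            if (!eof) {
                // Keep only the tail that could still begin a match: at most
                // maxMatchLen - 1 characters, and nothing already consumed.
                int cut = Math.max(keepFrom, window.length() - maxMatchLen + 1);
                window.delete(0, cut);
            }
        }
        return matches;
    }
}
```

The window never holds more than about twice maxMatchLen characters, which is
the whole point of the restriction.  If a match can in fact be longer than
maxMatchLen, this sketch may report it truncated or miss it, which is the
trade-off described above.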

>Therefore, I wonder if it would be possible to get the source code for the
>1.1 version of OROMatcher. We would of course contribute all patches done.

Pre-jakarta source code is not available for various legal reasons :( 

