commons-dev mailing list archives

From Dave Brosius <>
Subject Re: Proposed Contribution to Apache Commons,
Date Sat, 24 Oct 2015 20:27:29 GMT
Dear Mr. Shapiro,


Thanks for wanting to share this codebase, and making it available at 

I have attempted to clean up the repository to make it more approachable
for others who want to take a look, including reorganizing the src tree
and adding a proper Maven build system. These changes should make it
easier to consume.

If you wouldn't mind going to

and looking at the pull request, and if acceptable, pushing the merge 
button, that would be great.

Thanks again for your source contributions,


On 10/24/2015 11:14 AM, wrote:
> My colleague, Jeff Rothenberg, and I are retired computer scientists and are
> no strangers to regular expression theory and practice. Both of us have used
> regular expressions for decades and have taught many other programmers how to
> use them. Stephen Kleene, the inventor of regular expressions, and I were
> both doctoral students of Alonzo Church. Rothenberg used
> SNOBOL3 and SNOBOL4 (more powerful than all but a few of the most recent
> versions of regular expressions) extensively in his graduate work in
> Artificial Intelligence in the late 1960s and early 1970s.
> In our experience, although skilled programmers can write regular expressions
> that solve a wide range of problems, for all but the simplest tasks regular
> expressions quickly become "write only". That is, once they have aged for a
> while, no one other than their authors (and, in our experience, often not even
> they) can understand them well enough to verify, modify, debug, or maintain
> them without considerable effort. Analogous low-level programming formalisms,
> such as machine code and assembly language, have been replaced by
> higher-level, more readable and modular languages to produce programs that
> have proven easier and more cost-effective to debug, verify, maintain, reuse,
> and extend.
> In a similar fashion, Naomi is a means of "taming" complex regular
> expressions, as well as offering an easier alternative for those who are
> unfamiliar with them. Naomi makes pattern matching programs more readable,
> modular, and therefore verifiable, maintainable, and extensible. Naomi
> ultimately generates regular expressions, and it can do everything they can
> do, but it provides a higher-level API that uses object-oriented constructs to
> define complex, modular, parameterized patterns and subpatterns.
> Naomi's advantages over bare regular expressions become apparent only for
> larger scale pattern matching tasks. Whereas regular expressions are highly
> compact and terse, this virtue becomes a vice for complex patterns. Coupled
> with the extensive use of metacharacters and escape sequences, this makes even
> moderately complex regular expressions effectively unreadable for all but the
> most experienced and practiced regular expression programmers. Newer features
> that go beyond the original regular expression formalism--such as namable
> groups, built-in names for common character classes, comments, and free white
> space--make regular expressions less terse. But their use is not enough to
> render complex regular expressions easily readable. These extensions are
> analogous to replacing binary machine language by assembly language coding. It
> is only necessary to consider a complex problem--such as that of parsing the
> e-mail date-time specification of RFC 2822 in src/ --to appreciate
> the obscurity of regular expressions and to understand Naomi's advantages.
>      Norman Shapiro
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
