httpd-test-dev mailing list archives

From Justin Erenkrantz <>
Subject Regular expressions in Flood?
Date Tue, 14 Aug 2001 06:09:46 GMT
[ Maybe some Perl hacker will read this and have thoughts... ]

Anyway, I encountered the following situation when running flood today
and the only way I can think of resolving this is with full blown
regular expressions.

Here's the scenario I have:

We want to extract some information from the response returned by
the server.  So, let's say we want to get an ID back that is embedded
in a URL.  An example:

...blah...<A HREF="" class="bar">Justin's test</A>...blah

So what I currently have in CVS will work like ($ is a really bad
delimiter, but it's what I chose and is easily changeable if I could
come up with something better):

<A HREF="$$" class="bar">Justin's test</A>

And, $$ will now take on the value 123 by some rudimentary pattern matching.

However, that all gets shot to hell when faced with:

<A HREF="$$" tabindex="50" class="bar">Justin's test</A>

Now, the tabindex value is keyed off of its position within the document
(and we can't move the tabindex value around due to limitations in JSP
land).  I definitely don't want to hardcode 50 in the response
"template" (i.e.  what flood will look for).  So, the alternative seems
to be to bite the bullet and use regex.  So, the above example could be
coded in regex as:

<A HREF="([^"]*)" ([^>]*)>Justin's test</A>

Is this correct (Roy says so)?  Then, $1 (variable one in the regex) is 
123 in my example.  $2 is the rest of the junk I don't care much about.
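
As a rough sketch of how this could look with the POSIX regex functions
(not flood code): the helper name extract_group, the 10-slot match array,
and the sample input with HREF="123" are all made up for illustration, and
the capture on the HREF value is written as ([^"]*).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 10  /* assumed limit: whole match plus 9 sub-matches */

/* Hypothetical helper, not part of flood: copy capture group `group` of the
 * first match of `pattern` in `text` into `out` (NUL-terminated).
 * Returns 0 on success, -1 on a bad pattern, no match, an unused group,
 * or when `out` is too small. */
int extract_group(const char *pattern, const char *text, size_t group,
                  char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int rc = -1;

    assert(out != NULL && outlen > 0);
    if (group >= MAX_GROUPS)
        return -1;

    /* REG_EXTENDED so that ( ) mark groups without backslashes. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* m[0] is the whole match; m[1], m[2], ... are $1, $2, ... */
    if (regexec(&re, text, MAX_GROUPS, m, 0) == 0 && m[group].rm_so != -1) {
        size_t len = (size_t)(m[group].rm_eo - m[group].rm_so);
        if (len < outlen) {
            memcpy(out, text + m[group].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }

    regfree(&re);
    return rc;
}
```

Calling extract_group() with group 1 on a response containing
<A HREF="123" tabindex="50" class="bar">Justin's test</A> would be expected
to yield 123, and group 2 the remaining attributes.  Taking the group index
as a parameter rather than hardcoding 1 keeps the choice of which sub-match
to grab outside the matching code.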

This also raises the question of how I tell flood that I want to
retrieve $1 and place it in my "state" table.  I don't know exactly how
to do that.  I'm just thinking to hardcode $1 as what it should grab.
Maybe I could add a responsetemplatevalue in XML which says, "Use
this number parameter from the regex and store its value in your state 
table."  Is there some common semantic for doing this?

Also, does anyone know anything about the POSIX regex functions (in 
regex.h)?  Is there a reason to use PCRE even when the POSIX regex 
functions are available?  I've coded up a quick proof-of-concept using
the POSIX regex functions, but I'm not sure why httpd doesn't use the
POSIX library (unless it isn't very common).  I haven't come across a
system that didn't have POSIX regex, but I'll bet there is one.
However, both of the "target" platforms (Solaris and Linux) have
the POSIX regex libraries.  So, I'm tempted not to use PCRE unless 
there is a good reason to.  -- justin
