tomcat-taglibs-user mailing list archives

From "Miles Kurosky" <>
Subject Scrape taglib caching
Date Sat, 28 Dec 2002 19:55:39 GMT
The Scrape tag seems to need reloading at least 3 times (regardless of the 
time attribute's value or status) before it does a fresh scrape. I tried 
modifying the tag to have a default minimum time of one minute instead of 
the 10 minutes it ships with.

Regardless of the time attribute's value, the taglib always needs to be 
reloaded several times before a fresh scrape occurs (unless the page has 
been modified). With my modified version of the taglib, it takes several 
reloads after one minute has passed rather than after 10 minutes have 
passed.

The behavior as outlined in the doc* seems to imply this in a roundabout 
way. That is, the time attribute seems not to be taken into account at all, 
and only an expired header or a modification triggers a rescrape. Am I 
reading the chain correctly?

The Scrape tag's 10-minute default minimum seems a bit long. The ability to 
scrape every second or every minute is useful (I'm using it to pull in 
current weather). The caching behavior is strange too. I've looked at the 
sources for PageTag and PageData but can't figure out why the page needs to 
be evaluated three times (or so) before a rescrape is triggered (and no, 
it's not my browser or container). Can anyone point me to the block(s) I 
should modify to make it rescrape every second and not be cached?

* From the Scrape doc:

1> The status of the scrape tags and attributes in the JSP is examined. Any 
modifications to the tags or attributes trigger a rescrape. If the tags have 
not been modified, the JSP proceeds to step 2.

2> The minimum time for rescraping, specified by the time attribute of the 
page tag, is examined. The default time is 10 minutes. If this time has not 
passed since the last scrape, cached results are returned. If this time has 
passed, the JSP proceeds to step 3.

3> The expired header of the scraped document is examined. If the expiration 
date/time has not passed, cached results are returned. If the expiration 
date/time is not specified or the document has expired, the JSP proceeds to 
step 4.

4> The headers for the scraped document are requested and examined. If the 
document has not been modified since the last scrape, cached results are 
returned. If the document has been modified, it is rescraped and the new 
results are returned.
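
For reference, the four documented steps can be written out as a single 
decision function. This is only a sketch of the chain as the doc describes 
it, not the taglib's actual PageTag or PageData code; the class name, method 
name, and parameters are all made up for illustration. It shows that the 
time value only gates the later checks: once the minimum time has passed, a 
rescrape still happens only if the Expires check falls through and the 
document reports a modification.

```java
// Hypothetical sketch of the documented rescrape chain (steps 1 to 4).
// None of these names come from the Scrape taglib sources.
final class RescrapeDecision {

    /**
     * Returns true if a fresh scrape should happen, false if cached
     * results should be returned.
     *
     * @param tagsModified      step 1: scrape tags/attributes in the JSP changed
     * @param nowMillis         current time in milliseconds
     * @param lastScrapeMillis  time of the last scrape in milliseconds
     * @param minIntervalMillis step 2: minimum time between scrapes (time attribute)
     * @param expiresMillis     step 3: expiration time of the scraped document,
     *                          or null if none was specified
     * @param documentModified  step 4: document changed since the last scrape
     */
    static boolean shouldRescrape(boolean tagsModified,
                                  long nowMillis,
                                  long lastScrapeMillis,
                                  long minIntervalMillis,
                                  Long expiresMillis,
                                  boolean documentModified) {
        // Step 1: any change to the tags or attributes forces a rescrape.
        if (tagsModified) {
            return true;
        }
        // Step 2: minimum time not yet passed, so use the cache.
        if (nowMillis - lastScrapeMillis < minIntervalMillis) {
            return false;
        }
        // Step 3: document has an expiration time that has not passed yet.
        if (expiresMillis != null && nowMillis < expiresMillis) {
            return false;
        }
        // Step 4: rescrape only if the document was actually modified.
        return documentModified;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long lastScrape = 0L;
        long now = 5 * minute; // well past a one-minute minimum

        // Minimum time passed, no expiration given, document unchanged: cached.
        System.out.println(shouldRescrape(false, now, lastScrape, minute, null, false)); // false
        // Same situation, but the document reports a modification: rescrape.
        System.out.println(shouldRescrape(false, now, lastScrape, minute, null, true));  // true
    }
}
```

If the real code follows this chain, lowering the minimum time alone would 
not produce a fresh scrape on every request; forcing one would mean 
returning true once step 2 passes instead of continuing to steps 3 and 4.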
