cocoon-users mailing list archives

From "Bertrand Delacretaz" <bdelacre...@apache.org>
Subject Re: Crawling over web pages with cocoon (Running a pipeline per page)
Date Tue, 05 Sep 2006 08:18:39 GMT
On 9/4/06, Nils Kaiser <NilsKaiser@gmx.net> wrote:

> ...So the question is, how do I crawl the page
> automatically - or if not possible, what is the best way to achieve a
> similar behavior?...

For a one-time job of converting a collection of webpages, I'd use an
external crawler like wget, and create Cocoon pipelines to do the
format conversion.

You'll need a "table of contents" page that links, directly or
indirectly, to all the other pages. Use that page as the entry
point for wget.
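
As a rough illustration of that setup, here is a minimal Python sketch. It
assumes a hypothetical list of page URLs and a hypothetical Cocoon instance
at http://localhost:8888/convert/. Neither comes from the original question,
and in practice the table of contents would more likely be produced by a
Cocoon pipeline itself. The helper names build_toc and wget_command are made
up for this example. The wget options used (--recursive, --level, --no-parent,
--directory-prefix) are standard ones.

```python
from html import escape


def build_toc(page_urls):
    """Return a minimal HTML page that links to every URL in page_urls."""
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(escape(url, quote=True))
        for url in page_urls
    )
    return "<html><body><ul>\n" + items + "\n</ul></body></html>"


def wget_command(toc_url, out_dir, depth=2):
    """Build the argument list for a recursive wget run starting at toc_url."""
    return [
        "wget",
        "--recursive",                    # follow links found in fetched pages
        "--level=" + str(depth),          # limit how deep the crawl goes
        "--no-parent",                    # stay at or below the TOC's directory
        "--directory-prefix=" + out_dir,  # where converted pages are saved
        toc_url,
    ]


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    pages = [
        "http://localhost:8888/convert/page1.html",
        "http://localhost:8888/convert/page2.html",
    ]
    print(build_toc(pages))
    print(" ".join(wget_command("http://localhost:8888/convert/toc.html", "out")))
```

The list returned by wget_command can be passed to subprocess.run(cmd,
check=True) to start the crawl, provided wget is installed. Each page wget
requests would then go through whatever conversion pipeline matches its URL.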

You could of course do the whole thing in Cocoon, but it's probably
faster to implement and test with this combination of tools.

-Bertrand
