manifoldcf-user mailing list archives

From Hitoshi Ozawa
Subject Re: Export crawled URLs
Date Mon, 05 Dec 2011 00:50:15 GMT
Is "history" just entries in the "repohistory" table with entityid =


(2011/12/03 1:43), Karl Wright wrote:
> The best place to get this from is the simple history.  A command-line
> utility to dump this information to a text file should be possible
> with the currently available interface primitives.  If that is how you
> want to go, you will need to run ManifoldCF in multiprocess mode.
> Alternatively you might want to request the info from the API, but
> that's problematic because nobody has implemented report support in
> the API as of now.
> A final alternative is to get this from the log.  There is an [INFO]
> level line from the web connector for every fetch, I seem to recall,
> and you might be able to use that.
> Thanks,
> Karl
> On Fri, Dec 2, 2011 at 11:18 AM, M Kelleher wrote:
>> Is it possible to export / download the list of URLs visited during a crawl job?
>> Sent from my iPad

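The log-based alternative Karl mentions could be sketched roughly as below. This is only a sketch: it assumes nothing about the actual ManifoldCF log format beyond each fetch line containing the level marker "INFO" and an http(s) URL. The function name, the regular expression, and the log file name in the usage note are all hypothetical.

```python
import re

# Assumption: a fetched URL appears verbatim somewhere in the log line.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>]+")


def extract_fetched_urls(lines, level_marker="INFO"):
    """Return the unique URLs found on lines containing level_marker, in first-seen order."""
    seen = set()
    urls = []
    for line in lines:
        # Skip lines logged at other levels (DEBUG, WARN, ...).
        if level_marker not in line:
            continue
        for url in URL_PATTERN.findall(line):
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```

It could be run over a log file with something like `extract_fetched_urls(open("manifoldcf.log", encoding="utf-8"))`, where the file name is a placeholder. The level marker and pattern would need adjusting to whatever the web connector really writes.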