nutch-dev mailing list archives

From Michael Wechner <michael.wech...@wyona.com>
Subject difference between intranet and internet crawling
Date Wed, 20 Dec 2006 16:47:59 GMT
Hi

There are several posts about the difference between

regex-urlfilter.txt and crawl-urlfilter.txt

e.g. http://www.mail-archive.com/nutch-user@lucene.apache.org/msg06318.html

or 
http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200503.mbox/%3c1815d86605033100396d330be@mail.gmail.com%3e

but, at the risk of asking a stupid question, what do you mean by intranet
and internet crawling?

In the end both of them are just URLs ... right? It seems to me that I am
completely misunderstanding something.

Thanks for a hint

Michi

-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org
+41 44 272 91 61

