Hi,
Somebody (Paul?) mentioned using Droids for a 50M-page crawl. Is anyone else
using Droids for crawls of that size?
I'm asking because I need to do a "semi-vertical" crawl of up to 10K
domains, and I'm weighing Droids against Nutch. That may translate to several
times as many distinct servers, say 100K, and in turn to a few hundred
million web pages. That's too big for Droids without a persistent link queue,
right?
Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/