incubator-droids-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Droids suitability for a 100M+ page crawl
Date Fri, 25 Mar 2011 20:08:40 GMT
Hi,

Somebody (Paul?) mentioned using Droids for a 50M-page crawl. Is anyone else
using Droids for crawls of that size?

I'm asking because I need to run a "semi-vertical" crawl on up to 10K
domains, and I'm considering Droids vs. Nutch. 10K domains may translate to
several times that many distinct servers, say 100K, and that in turn may
translate to a few hundred million web pages. That seems too big for Droids
without a persistent link queue, right?
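
To put a rough number on that, here is a back-of-envelope sketch of how much
heap an in-memory link queue might need. Everything in it is an assumption for
illustration: the class and method names are hypothetical (not Droids APIs),
the average URL length of 80 characters and the 64 bytes of per-entry overhead
are guesses, and strings are assumed to cost 2 bytes per character. It uses
100M URLs, the low end of the estimate above.

```java
// Hypothetical helper, not part of Droids: rough heap estimate for an
// in-memory URL queue.
class QueueEstimate {

    // Assumes 2 bytes per character (UTF-16 strings) plus a fixed
    // per-entry overhead for object headers and queue node bookkeeping.
    static long estimateBytes(long urls, int avgUrlChars, int overheadBytes) {
        return urls * ((long) avgUrlChars * 2 + overheadBytes);
    }

    public static void main(String[] args) {
        long urls = 100_000_000L;   // low end of "a few 100M" pages
        int avgUrlChars = 80;       // assumed average URL length
        int overheadBytes = 64;     // assumed per-entry overhead

        long bytes = estimateBytes(urls, avgUrlChars, overheadBytes);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("Estimated queue size: %.1f GiB%n", gib);
    }
}
```

Under these assumptions the queue alone needs tens of gigabytes of heap, and
that is before counting any structure for remembering already-visited URLs.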

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

