nutch-dev mailing list archives

From: Ken van Mulder <>
Subject: nutch questions
Date: Thu, 08 Dec 2005 23:59:31 GMT
Hey folks,

We're looking at launching a search engine at the beginning of the new
year, one that will eventually grow into a multi-billion-page index.
Three questions:

First, and most important for now: does anyone have useful numbers on
the hardware required to run an engine at that scale? I have numbers
for how fast I can get the crawlers working, but not for how many
pages each search node can serve or how much processing power the
indexing requires.
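
To make concrete the kind of numbers I'm after, here is the
back-of-envelope sizing arithmetic they would feed into (a Python
sketch). Every figure in it is an invented placeholder, not a
measured Nutch benchmark; substitute real values as they come in:

    # Rough search-tier sizing. All constants are assumptions.
    TARGET_PAGES = 2 * 10**9            # assumed: eventual index size
    PAGES_PER_SEARCH_NODE = 20 * 10**6  # assumed: pages one node can serve
    PEAK_QPS = 500                      # assumed: peak query load
    QPS_PER_NODE = 25                   # assumed: queries/sec per node

    # Nodes needed just to hold the index (ceiling division):
    nodes_for_capacity = -(-TARGET_PAGES // PAGES_PER_SEARCH_NODE)
    # Nodes needed to absorb the query load:
    nodes_for_load = -(-PEAK_QPS // QPS_PER_NODE)

    search_nodes = max(nodes_for_capacity, nodes_for_load)
    print("search nodes required:", search_nodes)

With those made-up inputs the index size dominates: 100 nodes at 20
million pages each. What I'm missing are real values for the two
per-node constants.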

Second, what still needs to be done to Nutch before it can handle
billions of pages? Is there a general list of outstanding requirements?

Third, if Nutch isn't capable of doing what we need, what is its
expected upper limit, assuming we use the map/reduce version?
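
To make the "upper limit" question concrete, the same kind of
arithmetic applies to indexing throughput under map/reduce. Again,
the rates below are invented placeholders, not measured numbers:

    # Rough full-index time estimate. All constants are assumptions.
    TARGET_PAGES = 2 * 10**9       # assumed: eventual index size
    PAGES_PER_NODE_PER_SEC = 300   # assumed: per-node indexing rate
    CLUSTER_NODES = 50             # assumed: map/reduce cluster size

    seconds = TARGET_PAGES / (PAGES_PER_NODE_PER_SEC * CLUSTER_NODES)
    print("full index pass: %.1f hours" % (seconds / 3600.0))

With those inputs a full pass takes roughly 37 hours; the practical
upper limit is wherever that, and the equivalent fetch/updatedb
arithmetic, stops being tolerable on hardware we can afford.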


Ken van Mulder
Wavefire Technologies Corporation
250.717.0200 (ext 113)
