nutch-dev mailing list archives

From "Ledio Ago" <l...@looksmart.net>
Subject RE: [Nutch-dev] distributed search
Date Tue, 20 Dec 2005 01:25:22 GMT
Rafi,

Based on what you're saying, this tool splits a fetchlist into several fetchlists
so that we can crawl/fetch the URLs from different fetchers, right?

If so, that is not what I'm after.  I'm trying to split an existing index
into smaller partitions, so that I can make those partitions searchable from
multiple Nutch searchers (distributed search).

Thanks,

Ledio

-----Original Message-----
From: Rafi Iz [mailto:rafi_dev@hotmail.com]
Sent: Monday, December 19, 2005 4:49 PM
To: nutch-dev@lucene.apache.org
Subject: Re: [Nutch-dev] distributed search



Check the following command:
FetchListTool (-local | -ndfs <namenode:port>) <db>  <segment_dir> 
[-refetchonly] [-topN N] [-cutoff cutoffscore] [-numFetchers numFetchers] 
[-adddays numDays]

This command calls a function named emitMultipleLists, which writes out
several fetchlists so that you can fetch across several machines.

e.g.
bin/nutch org.apache.nutch.tools.FetchListTool ......

Rafi


>From: Stefan Groschupf <sg@media-style.com>
>Reply-To: nutch-dev@lucene.apache.org
>To: nutch-dev@lucene.apache.org
>Subject: Re: [Nutch-dev] distributed search
>Date: Tue, 20 Dec 2005 00:38:22 +0100
>
>>By the way, is there an easy way to split the index I already have?
>>I would hate to recrawl all of the 1.9MM URLs again and waste bandwidth.
>
>Well, I do not know of any tool, shipped with Nutch or otherwise, that
>does this; maybe there is one.
>But writing a Java class that creates two smaller indexes from one large
>one is very easy, an hour's work at most.
>Just check any of the existing Lucene tutorials, the Lucene Javadoc, or the
>Lucene book.
>BTW, Erik Hatcher's book "Lucene in Action" is a MUST for all Nutch users.
>:-)
>
>Stefan
>
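
As a rough illustration of the split Stefan describes, here is a minimal sketch
of only the partitioning decision, in plain Java. It is not a Nutch or Lucene
tool: the class name IndexPartitioner and the method partition are hypothetical,
and the documents are stand-in strings. The Lucene steps (reading documents from
the large index and writing each one into a smaller index) are deliberately left
out; the sketch only shows one simple way to decide which partition a document
goes to, namely round-robin by document position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch or Lucene.
class IndexPartitioner {

    /**
     * Assigns each document to one of numPartitions lists, round-robin by
     * its position in the input (document i goes to partition i % numPartitions).
     */
    static <T> List<List<T>> partition(List<T> docs, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            parts.get(i % numPartitions).add(docs.get(i));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Stand-in "documents"; in a real tool these would come from the index.
        List<String> docs = Arrays.asList("doc0", "doc1", "doc2", "doc3", "doc4");
        List<List<String>> parts = partition(docs, 2);
        for (int p = 0; p < parts.size(); p++) {
            System.out.println("partition " + p + ": " + parts.get(p));
        }
    }
}
```

Round-robin keeps the partitions within one document of each other in size.
Assigning by a hash of the URL instead would be another reasonable choice if a
given page should always land in the same partition.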
