nutch-dev mailing list archives

From Kirby Bohling <kirby.bohl...@gmail.com>
Subject Re: Plugins: when to perform web service requests, on fetch or on index?
Date Fri, 19 Jun 2009 05:53:46 GMT
On Thu, Jun 18, 2009 at 1:42 PM, caezar<caezaris@gmail.com> wrote:
>
> The main idea is not to just store some useless data in the index. Searches
> are performed on this data, combined with keyword searches, so I need
> this data in the index.

Given what you've said here, I'd look at the "index-more" plugin.  I
followed it, together with the pages linked below, when I added a
category and keywords to pages (I added synonyms of domain-specific
terms to help find additional data, without forcing the user to search
and re-search).  The index-more source showed me how to add the extra
fields, and the "explain" link on the search results page was very
helpful for seeing how they worked into the scoring.
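
As a rough sketch of the synonym part only: the class below is not Nutch
code, and SynonymExpander and its synonym table are hypothetical names
made up for illustration.  It assumes the keywords have already been
extracted from the page.  In a real plugin the expanded terms would be
added to an extra document field from an indexing filter, much as
index-more adds its own fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Nutch: expands a page's keywords with
// domain-specific synonyms before they are written to an extra index field.
final class SynonymExpander {
    // Lower-cased keyword -> synonyms for that keyword (assumed to be
    // supplied by the caller, e.g. loaded from a config file).
    private final Map<String, List<String>> synonyms;

    SynonymExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    // Returns each keyword followed by its synonyms, all lower-cased,
    // with duplicates dropped and first-seen order preserved.
    List<String> expand(List<String> keywords) {
        Set<String> out = new LinkedHashSet<>();
        for (String keyword : keywords) {
            String key = keyword.trim().toLowerCase(Locale.ROOT);
            if (key.isEmpty()) {
                continue; // skip blank keywords
            }
            out.add(key);
            for (String synonym : synonyms.getOrDefault(key, List.of())) {
                out.add(synonym.toLowerCase(Locale.ROOT));
            }
        }
        return new ArrayList<>(out);
    }
}
```

Doing the expansion at index time means the user's query does not need
to be rewritten; a search for a synonym matches the extra field directly.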

I'm fairly sure that this is what you need according to what you've said:
http://wiki.apache.org/nutch/HowToMakeCustomSearch

This link was useful:
http://wiki.apache.org/nutch/WritingPluginExample-0.9

This is somewhat helpful:
http://wiki.apache.org/nutch/FAQ?highlight=(scoring)#head-347f304e874bee7ff37f8b1a69f9983103cc3150

Hope this is useful,
    Kirby



>
> joel gump wrote:
>>
>> If only the URLs are used, how about saving the URLs to a database?
>> Then another app checks the database and calls the web services.
>> Personally, I don't like mixing anything into the crawl/index process.
>>
>
