nutch-dev mailing list archives

From "julien nioche (JIRA)" <>
Subject [jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Date Thu, 13 Nov 2008 09:37:44 GMT


julien nioche commented on NUTCH-442:

In the class should we specify :

around the line 48?

Otherwise we might have several attempts of the same reduce task running at the same time and sending
the same documents to the same Solr instance, which is likely to slow down the indexing. Speculative
execution does not make the indexing safer, as these attempts would all crash in the same way if they
receive a SolrException.
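
To illustrate the concern, here is a toy model in plain Java. It is not Nutch or Hadoop code:
the class SpeculativeIndexingSketch, the FakeIndex stand-in for a Solr instance and the
documentsSent helper are all hypothetical names made up for this sketch. It only shows that when
a reduce task writes to an external service, every extra attempt of that task resends the whole
input. In Hadoop's old mapred API the switch for this is JobConf.setReduceSpeculativeExecution(false);
treating that as the line meant in the question above is an assumption, since the original
snippet is not preserved here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (hypothetical, not Nutch/Hadoop code): counts documents reaching one index. */
public class SpeculativeIndexingSketch {

    /** Stand-in for a single Solr instance; it just records every document it receives. */
    static final class FakeIndex {
        private final List<String> received = new ArrayList<>();

        void add(String docId) {
            received.add(docId);
        }

        int size() {
            return received.size();
        }
    }

    /**
     * Runs one reduce task as the given number of attempts over the same input.
     * Each attempt sends every document, because the output goes to an external
     * service rather than to a task-local file that can be discarded.
     */
    static int documentsSent(List<String> docIds, int attempts) {
        FakeIndex index = new FakeIndex();
        for (int attempt = 0; attempt < attempts; attempt++) {
            for (String id : docIds) {
                index.add(id);
            }
        }
        return index.size();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc-1", "doc-2", "doc-3");
        // One attempt: speculative execution disabled for the reduce phase.
        System.out.println("single attempt: " + documentsSent(docs, 1));
        // Two attempts: the original task plus one speculative duplicate.
        System.out.println("two attempts:   " + documentsSent(docs, 2));
    }
}
```

With three documents, a single attempt sends three updates and two concurrent attempts send six,
so the same Solr instance does twice the work for the same result.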

> Integrate Solr/Nutch
> --------------------
>                 Key: NUTCH-442
>                 URL:
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer, searcher
>         Environment: Ubuntu linux
>            Reporter: rubdabadub
>            Assignee: Doğacan Güney
>             Fix For: 1.0.0
>         Attachments: Crawl.patch, Indexer.patch, NUTCH-442_v4.patch, NUTCH-442_v5.patch,
NUTCH-442_v6.patch.txt, NUTCH-442_v7.patch.txt, NUTCH-442_v7a.patch.txt, NUTCH-442_v8.patch,
NUTCH_442_v3.patch, RFC_multiple_search_backends.patch, schema.xml
> Hi:
> After trying out Sami's patch regarding Solr/Nutch. Can be found here (
and I can confirm it worked :-) And that led me to request the following:
> I would be very, very grateful if this could be included in Nutch 0.9, as I am trying
to eliminate my Python-based crawler, which posts documents to Solr. As I am in a corporate
environment, I can't install the trunk version in the production environment, so I am asking for this
to be included in the 0.9 release. I hope my wish will be granted.
> I look forward to getting some feedback.
> Thank you.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
