nutch-dev mailing list archives

From "Michael Joyce (JIRA)" <>
Subject [jira] [Commented] (NUTCH-1987) Make bin/crawl indexer agnostic
Date Wed, 15 Apr 2015 21:17:59 GMT


Michael Joyce commented on NUTCH-1987:

Hey Sebastian, thanks for the feedback.

I agree the positional argument handling is a bit daft. I was aiming for a quick intermediate
solution that didn't disrupt too much while getting this functionality in there. I'm happy
to update this patch with nicer argument handling, or to wait and do a quick follow-on
patch if this gets merged. Whatever works for everyone is fine with me.

> Make bin/crawl indexer agnostic
> -------------------------------
>                 Key: NUTCH-1987
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.9
>            Reporter: Michael Joyce
>             Fix For: 1.10
> The crawl script makes it a bit challenging to use an indexer that isn't Solr. For instance,
> when I want to use the indexer-elastic plugin, I still need to call the crawl script with
> a fake Solr URL; otherwise it will skip the indexing step altogether.
> {code}
> bin/crawl urls/ crawl/ "" 1
> {code}
> It would be nice to keep configuration for the Solr indexer in the conf files (to mirror
> the Elasticsearch indexer conf and others) and to make the indexing parameter simply toggle
> whether or not indexing occurs, instead of also trying to configure the indexer at
> the same time.
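
As a rough illustration of the toggle idea described in the issue, the sketch below treats the indexing argument as a plain on/off flag and leaves indexer configuration to the conf files. The function names `should_index` and `describe_index_step`, and the accepted flag values, are assumptions made for illustration only; they are not taken from bin/crawl or from the NUTCH-1987 patch.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: interpret an argument as an indexing on/off switch
# instead of a Solr URL. Names and accepted values are illustrative only.

should_index() {
  # Accept a few common spellings of "on"; anything else disables indexing.
  case "$1" in
    true|TRUE|yes|1) return 0 ;;
    *) return 1 ;;
  esac
}

describe_index_step() {
  # Report what the crawl loop would do for the given flag value.
  if should_index "$1"; then
    echo "indexing enabled; indexer settings come from the conf files"
  else
    echo "indexing skipped"
  fi
}
```

With a flag like this, the choice of indexer (Solr, Elasticsearch, or another plugin) would no longer depend on what is passed on the command line.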
