nutch-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events
Date Tue, 02 Aug 2016 15:26:20 GMT


ASF GitHub Bot commented on NUTCH-2132:

GitHub user sujen1412 opened a pull request:

    Fix for NUTCH-2132: Publisher/Subscriber model for Nutch to emit events

    This PR is still in progress and needs a review to get the plugin system working.
    It is not yet ready to commit.

You can merge this pull request into a Git repository by running:

    $ git pull NUTCH-2132

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #138

commit 5a13301a7808d852b88d8724c3c3b7783fa9d2be
Author: Sujen Shah <>
Date:   2015-10-02T08:03:35Z

    Added dependency for RabbitMQ

commit 3029ef4055834b72d63e7fc516eca448c2efe32a
Author: Sujen Shah <>
Date:   2015-10-02T08:04:39Z

    Code for FetcherThreadPublisher

commit ebfd7728e650cc7648a3939d3985826eefde93f3
Author: Sujen Shah <>
Date:   2015-10-02T08:05:11Z

    Added property descriptions in nutch-default.xml

commit 445fcc2d766ddef7cc36783ebcaecb552e4f2819
Author: Sujen Shah <>
Date:   2015-10-02T08:28:52Z

    Added support for queue routing key

commit 44498308634dda99f543d5e18724ad8cfeb16343
Author: Sujen Shah <>
Date:   2015-10-19T08:44:29Z

    Added properties to make publisher optional in nutch-default.xml

commit ad88c94fc274576aacaa2c17b1f55a087f7a04f9
Author: Sujen Shah <>
Date:   2015-10-27T02:40:45Z

    Added routingkey support

commit e380de803c8c129f0dfb7d8c31a8596b4ceae8bf
Author: Sujen Shah <>
Date:   2015-10-29T20:12:55Z

    Better exception handling when RMQ server is down

commit e4f5e13cc5675f9e7d37ebda39bf230c08baf4b8
Author: Sujen Shah <>
Date:   2016-08-02T15:01:48Z

    Created plugin system for pub/sub implementation in Nutch

commit 2c484ec4789c84f7bf9e592e15c96cf788ef5967
Author: Sujen Shah <>
Date:   2016-08-02T15:12:28Z

    Removed Rabbitmq dependency from ivy.xml and remove author tags


> Publisher/Subscriber model for Nutch to emit events 
> ----------------------------------------------------
>                 Key: NUTCH-2132
>                 URL:
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher, REST_api
>            Reporter: Sujen Shah
>            Assignee: Chris A. Mattmann
>              Labels: memex
>             Fix For: 1.13
>         Attachments: NUTCH-2132.patch, NUTCH-2132.v2.patch, PubSub_routingkey.patch
> It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g., Fetcher
> events such as fetch-start and fetch-end, or a fetch report that may contain data such as
> the outlinks of the currently fetched URL, its score, etc.).
> A consumer of this functionality could use this data to generate real-time visualizations
> and statistics of the crawl without having to wait for the fetch round to finish.
> The REST API could contain an endpoint that responds with a URL to which a client
> could subscribe to receive the fetcher events.
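
To make the proposal above concrete, here is a minimal in-process sketch of the publish/subscribe idea in Java. It is not taken from the pull request or from the Nutch code base: the names FetcherEventSketch, Event, and InMemoryPublisher, the event fields, and the example URL are all hypothetical, and a real implementation would presumably publish to a message broker rather than to in-memory callbacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from the Nutch code base.
final class FetcherEventSketch {

  /** One fetcher event, e.g. "fetch-start" or "fetch-end", with optional report data. */
  static final class Event {
    final String type;
    final String url;
    final Map<String, Object> data;

    Event(String type, String url, Map<String, Object> data) {
      this.type = type;
      this.url = url;
      // Defensive copy so later changes by the caller do not alter the event.
      this.data = Collections.unmodifiableMap(new LinkedHashMap<String, Object>(data));
    }
  }

  /** In-process publisher: delivers each event to every registered subscriber. */
  static final class InMemoryPublisher {
    private final List<Consumer<Event>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<Event> subscriber) {
      subscribers.add(subscriber);
    }

    void publish(Event event) {
      for (Consumer<Event> subscriber : subscribers) {
        try {
          subscriber.accept(event);
        } catch (RuntimeException e) {
          // A failing subscriber should not interrupt the fetch; report and continue.
          System.err.println("subscriber failed: " + e);
        }
      }
    }
  }

  public static void main(String[] args) {
    InMemoryPublisher publisher = new InMemoryPublisher();
    List<Event> received = new ArrayList<>();
    publisher.subscribe(received::add);
    publisher.subscribe(e -> System.out.println(e.type + " " + e.url + " " + e.data));

    String url = "http://example.com/"; // placeholder URL
    publisher.publish(new Event("fetch-start", url, Collections.<String, Object>emptyMap()));

    // A fetch report carrying the kind of data the issue mentions (outlinks, score).
    Map<String, Object> report = new LinkedHashMap<>();
    report.put("score", 1.0f);
    report.put("outlinks", Arrays.asList("http://example.com/a", "http://example.com/b"));
    publisher.publish(new Event("fetch-end", url, report));
  }
}
```

A subscriber here sees events as soon as they are published, which is the property the issue asks for: a consumer can update visualizations or statistics during the fetch round instead of after it.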

This message was sent by Atlassian JIRA
