nutch-dev mailing list archives

From "Markus Jelsma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (NUTCH-2194) Run IndexingFilterChecker as simple Telnet server
Date Wed, 13 Jan 2016 14:33:39 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-2194:
---------------------------------
    Description: 
We have used a customized IndexingFilterChecker running as a server to quickly test and check
pages from web applications. I'll add this feature back by letting IndexingFilterChecker
optionally run as a simple server.

Run it with:

{code}
export NUTCH_HEAPSIZE=25 ; bin/nutch indexchecker -normalize -dumpText -followRedirects -listen 1234
{code}

Then perform a request over TCP:

{code}
echo "http://apache.org/" | nc localhost 1234
{code}
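
For scripted checks, the same request can be sketched in Python. This is only an illustration of the exchange shown above, not part of the patch: the helper name check_url is hypothetical, and it assumes the server reads one newline-terminated URL and writes its output back on the same connection. Whether the server closes the connection afterwards is also an assumption, so the sketch stops reading on either end-of-stream or a timeout.

```python
import socket


def check_url(url, host="localhost", port=1234, timeout=30.0):
    """Send one URL to a listening indexchecker and return the reply as text.

    Hypothetical helper: mirrors `echo "<url>" | nc <host> <port>`.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Same payload as the echo/nc pipeline: the URL followed by a newline.
        sock.sendall(url.encode("utf-8") + b"\n")
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                # Assumption: the server may keep the connection open,
                # so a timeout is treated as the end of the reply.
                break
            if not data:
                # The server closed the connection; the reply is complete.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

With the server started as above, calling check_url("http://apache.org/") should return whatever the checker prints for that page; the exact output format depends on the configured indexing filters.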

  was:
We have used a customized IndexingFilterChecker running as server to be able to quickly test/check
pages from web applications. I'll add this feature back by letting IndexingFilterChecker run
optionally as a simple server.

Something like:
bin/nutch indexchecker -listen -port <port>

The port parameter only makes sense if -listen is present. Should we remove -listen, or make
the port an argument to -listen?


> Run IndexingFilterChecker as simple Telnet server
> -------------------------------------------------
>
>                 Key: NUTCH-2194
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2194
>             Project: Nutch
>          Issue Type: New Feature
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>             Fix For: 1.12
>
>         Attachments: NUTCH-2194.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
