nutch-dev mailing list archives

From Sujen Shah <sujen1...@gmail.com>
Subject Re: trouble using nutch server
Date Thu, 09 Apr 2015 01:41:45 GMT
Hi,
After a brief look at the source code, it seems you would have to use the
following:

localhost:8081/job/create
{
    "crawlId":"crawl-01",
    "type":"INJECT",
    "confId":"default",
    "args": {"seedDir":"/.../apache-nutch-2.3/runtime/local/url/"}
}
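
As a rough sketch of how that request could be sent from a script, here is a
small Python example using only the standard library. It assumes the server
accepts the JSON body via HTTP POST with a JSON content type, and that the
full URL is http://localhost:8081/job/create; the helper name
build_inject_request is hypothetical, and the seed directory is a placeholder
because the real path is elided above.

```python
import json
import urllib.request

# Assumed base URL: the thread only gives "localhost:8081/job/create".
NUTCH_SERVER = "http://localhost:8081"


def build_inject_request(seed_dir, crawl_id="crawl-01", conf_id="default"):
    """Build (but do not send) a job-creation request for an INJECT job."""
    payload = {
        "crawlId": crawl_id,
        "type": "INJECT",
        "confId": conf_id,
        # "seedDir" is the argument key suggested in this message.
        "args": {"seedDir": seed_dir},
    }
    return urllib.request.Request(
        NUTCH_SERVER + "/job/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint expects POST
    )


if __name__ == "__main__":
    # Placeholder path: replace with the real seed directory.
    request = build_inject_request("/path/to/seed/urls/")
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the
JSON body before it reaches the server.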

I am not aware of any further documentation for 2.x.

Sorry for the late reply. Hope it helps :)

Regards,
Sujen Shah
M.S - Computer Science (Class of 2016)
University of Southern California
+1(213)-820-9169
http://www.linkedin.com/in/sujenshah

On Tue, Apr 7, 2015 at 4:53 PM, Mahmoud Gzawi <gzawi.mhmd@gmail.com> wrote:

> Hi. Thanks for the reply.
>
> Yes. I referred to the documentation of the Nutch 1.x REST API:
> https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI.
>
> I thought it would be similar. I already had a look at the documentation
> of the Nutch 2.x API, but it's incomplete:
> https://wiki.apache.org/nutch/NutchRESTAPI.
>
> Can you please tell me what the right settings are for the Inject job in
> Nutch 2? Is there any documentation for the other Nutch jobs?
>
>
>
>
> On 08/04/2015 01:16, Sujen Shah wrote:
>
> Hi,
> It seems like you are using Nutch 2.x,
> and the args you passed look like the ones from the documentation of the
> Nutch 1.x REST service.
> Could you please tell me which documentation you referred to?
>
>   Regards,
> Sujen Shah
> M.S - Computer Science (Class of 2016)
> University of Southern California
> +1(213)-820-9169
> http://www.linkedin.com/in/sujenshah
>
> On Tue, Apr 7, 2015 at 3:58 PM, Mahmoud Gzawi <gzawi.mhmd@gmail.com>
> wrote:
>
>> Hi everyone.
>>
>> I'm having trouble creating a job in the Nutch server and would
>> appreciate any help.
>>
>> I'm trying to create a job in the Nutch server and I'm stuck at the beginning:
>>
>> localhost:8081/job/create
>> {
>>     "crawlId":"crawl-01",
>>     "type":"INJECT",
>>     "confId":"default",
>>     "args": {"crawldb":"crawl",
>>              "url_dir":"/.../apache-nutch-2.3/runtime/local/url/"}
>> }
>>
>> and here's the Hadoop log:
>>
>> 2015-04-08 00:37:52,102 INFO  api.NutchServer - Starting NutchServer on port: 8081 with logging level: INFO ...
>> 2015-04-08 00:37:52,137 INFO  api.NutchServer - Started NutchServer on port 8081
>> 2015-04-08 00:38:31,384 ERROR impl.JobWorker - Cannot run job worker!
>> java.lang.NullPointerException
>>     at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:207)
>>     at org.apache.nutch.api.impl.JobWorker.run(JobWorker.java:64)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Can anyone tell me what I'm doing wrong?
>> Thanks in advance.
>>
>
>
>
