[ https://issues.apache.org/jira/browse/NUTCH-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000733#comment-14000733 ]
Fjodor Vershinin commented on NUTCH-1774:
-----------------------------------------
Ok, let's commit this patch.
> Crawling from REST API giving NullPointerException
> --------------------------------------------------
>
> Key: NUTCH-1774
> URL: https://issues.apache.org/jira/browse/NUTCH-1774
> Project: Nutch
> Issue Type: Bug
> Components: REST_api
> Affects Versions: 2.2.1
> Reporter: sreemanth pulagam
> Fix For: 2.3
>
> Attachments: NUTCH-1774.patch
>
>
> Crawling is not working from REST API.
> Steps to reproduce.
> -----------------------
> 1. Start the Nutch server (port 9000).
> 2. Submit a PUT request to create/initiate a crawl job.
> e.g.:
> URL: http://localhost:9000/nutch/jobs
> HTTP METHOD: PUT
> Content:
> {
>   "crawl":"123",
>   "type":"crawl",
>   "conf":"default",
>   "args":{
>     "class":"org.apache.nutch.crawl.Crawler",
>     "seed":"http://www.somesite.com",
>     "seedDir":"runtime/local/url/url.txt",
>     "depth":2
>   }
> }
> 3. The following exception is thrown in the Generator phase:
> 2014-05-13 11:37:57,863 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(435)) - job_local1326997137_0002
> java.lang.NullPointerException
> at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
> at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
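
For anyone trying to reproduce this, step 2 above could be scripted roughly as follows. This is only a sketch: the URL and JSON payload are copied from the report, while the helper names (build_crawl_request, send) and the Content-Type header are assumptions, since the report does not list request headers.

```python
import json
import urllib.request

# Taken from the report: Nutch server on port 9000, jobs endpoint /nutch/jobs.
JOBS_URL = "http://localhost:9000/nutch/jobs"


def build_crawl_request(url=JOBS_URL):
    """Build (but do not send) the PUT request described in step 2."""
    payload = {
        "crawl": "123",
        "type": "crawl",
        "conf": "default",
        "args": {
            "class": "org.apache.nutch.crawl.Crawler",
            "seed": "http://www.somesite.com",
            "seedDir": "runtime/local/url/url.txt",
            "depth": 2,
        },
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: the server accepts a JSON content type.
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request; needs a running Nutch server (hypothetical helper)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With the server from step 1 running, calling send(build_crawl_request()) submits the job. On an affected version the request itself may succeed, and the failure reported here would only show up later in the server log, during the Generator phase.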
--
This message was sent by Atlassian JIRA
(v6.2#6252)