nutch-dev mailing list archives

From "Fjodor Vershinin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1774) Crawling from REST API giving NullPointerException
Date Tue, 13 May 2014 12:26:15 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996316#comment-13996316 ]

Fjodor Vershinin commented on NUTCH-1774:
-----------------------------------------

I think the best option is to apply this patch after the large API refactoring has been
reviewed and applied. [~lewismc], what do you think?

> Crawling from REST API giving NullPointerException
> --------------------------------------------------
>
>                 Key: NUTCH-1774
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1774
>             Project: Nutch
>          Issue Type: Bug
>          Components: REST_api
>    Affects Versions: 2.2.1
>            Reporter: sreemanth pulagam
>         Attachments: NUTCH-1774.patch
>
>
> Crawling does not work from the REST API.
> Steps to reproduce:
> -----------------------
> 1. Start the Nutch server (port 9000).
> 2. Submit a PUT request to create/initiate a crawl job.
>    e.g.:
>            URL: http://localhost:9000/nutch/jobs  
>            HTTP METHOD: PUT
>            Content: 
>                 {
>                    "crawl":"123",
>                    "type":"crawl",
>                    "conf":"default",
>                    "args":{
>                       "class":"org.apache.nutch.crawl.Crawler",
>                       "seed":"http://www.somesite.com",
>                       "seedDir":"runtime/local/url/url.txt",
>                       "depth":2
>                    }
>                 }
> 3. The following exception is thrown in the Generator phase:
> 2014-05-13 11:37:57,863 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(435)) - job_local1326997137_0002
> java.lang.NullPointerException
> 	at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
> 	at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
> 	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
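
For anyone trying to reproduce this, step 2 of the quoted report could be scripted roughly as follows. This is only a sketch: the URL, HTTP method and JSON fields are copied from the report, while the helper names (`build_crawl_job`, `build_put_request`, `submit_job`) and the `Content-Type` header are assumptions made for illustration and are not part of Nutch.

```python
import json
import urllib.request

# Endpoint taken from step 2 of the report (Nutch server on port 9000).
NUTCH_JOBS_URL = "http://localhost:9000/nutch/jobs"


def build_crawl_job(crawl_id="123", depth=2):
    """Return the job description from the report as a dict (hypothetical helper)."""
    return {
        "crawl": crawl_id,
        "type": "crawl",
        "conf": "default",
        "args": {
            "class": "org.apache.nutch.crawl.Crawler",
            "seed": "http://www.somesite.com",
            "seedDir": "runtime/local/url/url.txt",
            "depth": depth,
        },
    }


def build_put_request(job, url=NUTCH_JOBS_URL):
    """Wrap the job description in an HTTP PUT request with a JSON body."""
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: the report does not say which headers the server expects.
        headers={"Content-Type": "application/json"},
    )


def submit_job(request):
    """Send the request and return (status, body); needs a running Nutch server."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With the server from step 1 running, `submit_job(build_put_request(build_crawl_job()))` would send the same request as in the report. Building the request separately from sending it makes it possible to inspect the payload without a server.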



