nutch-dev mailing list archives

From 程越强 <strongerw...@gmail.com>
Subject RPC timeout
Date Tue, 10 Feb 2009 02:53:34 GMT
Hi all,
  When I use Nutch and Hadoop (in pseudo-distributed mode) and try to
crawl some web sites, I get an RPC timeout error:
Injector: java.lang.RuntimeException: java.net.SocketTimeoutException: timed out waiting for rpc response
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:252)
    at org.apache.hadoop.mapred.JobConf.setInputPath(JobConf.java:155)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:154)
    at org.apache.nutch.crawl.Injector.run(Injector.java:192)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
    at org.apache.nutch.crawl.Injector.main(Injector.java:182)
Caused by: java.net.SocketTimeoutException: timed out waiting for rpc response
    at org.apache.hadoop.ipc.Client.call(Client.java:473)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
    at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:247)
    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:105)
    at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
    at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
    at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:248)
    ... 5 more

I have run "hadoop dfs -put urls urls" (urls is a directory containing
one txt file), and with "hadoop dfs -ls" I can see the urls directory on DFS.

"start-all.sh" works well.
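For what it's worth, the stack trace shows the timeout happening while the DFS client opens its first RPC connection to the NameNode (DFSClient constructor calling getProtocolVersion). A common cause of that is the client's fs.default.name pointing at a host/port where no NameNode is actually listening. A sketch of the relevant hadoop-site.xml entry for a pseudo-distributed setup (the hostname and port here are placeholders, not taken from the original report):

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml: the client resolves the NameNode from fs.default.name.
     Host and port below are example values; they must match the address
     the NameNode was started on. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

If the value here differs between the node running the Nutch Injector and the node where the NameNode runs, the client will wait for an RPC response that never comes and time out exactly as above.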

Does anyone know how to solve this problem? Thanks!
-- 
程越强 Cheng Yueqiang
