nutch-dev mailing list archives

From "Ned Rockson" <nrock...@stanford.edu>
Subject Strange RemoteException thrown while doing a parse of ~64m documents
Date Wed, 03 Oct 2007 08:11:15 GMT
This is the second time I've run this large parse of ~64m documents, and
both times the same exception has been thrown during the reduce phase.
Has anyone seen this before, or could someone explain what is going on
here? The full stack trace follows:

org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create
file /disks/d0/nutch/mapreduce/system/job_0001/tip_0001_r_000008/task_0001_r_000008_0/data
for DFSClient_task_0001_r_000008_0 on client 208.96.54.73 because
current leaseholder is trying to recreate file.
	at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:669)
	at org.apache.hadoop.dfs.NameNode.create(NameNode.java:283)
	at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)

	at org.apache.hadoop.ipc.Client.call(Client.java:469)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:163)
	at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateNewBlock(DFSClient.java:1119)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1057)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1283)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.flush(DFSClient.java:1236)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.write(DFSClient.java:1218)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:395)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.append(SequenceFile.java:884)
	at org.apache.hadoop.io.MapFile$Writer.append(MapFile.java:162)
	at org.apache.nutch.parse.ParseOutputFormat$1.write(ParseOutputFormat.java:208)
	at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:311)
	at org.apache.nutch.parse.ParseSegment.reduce(ParseSegment.java:117)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:326)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
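For what it's worth, one commonly suggested cause for this kind of AlreadyBeingCreatedException is a second attempt of the same reduce task (for example, a speculative or retried attempt) trying to re-create the same output file on DFS while the first attempt still holds the lease. Assuming speculative execution is the trigger here (that is a guess, not something the trace confirms), it can be turned off in hadoop-site.xml:

```xml
<!-- hadoop-site.xml: disable speculative task execution so only one
     attempt of each task writes its output file.
     Assumption: the lease conflict comes from a duplicate speculative
     attempt, which is not verified from the stack trace above. -->
<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
</property>
```

If the job still fails with speculative execution off, the duplicate create is more likely coming from a retried attempt of a failed task rather than a speculative one.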
