Hi guys,
I wanted to use Crunch, but when I tried the examples I got:
org.apache.crunch.impl.mr.run.CrunchRuntimeException: java.io.IOException: File already exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
I am running a git (Apache Incubator) build of Crunch from 07/24/2012
against Hadoop 1.0.3. Maybe this is causing the error, since all the
dependencies are against Hadoop 0.20.x. Or maybe I have messed up my
Hadoop configuration (though I can run any of the Hadoop examples).
Regards
Gauthier
Stack trace:
714 [Thread-15] INFO org.apache.crunch.impl.mr.run.RTNode - Crunch exception in 'Text(out)' for input: [(http://www.apache.org/).,1]
org.apache.crunch.impl.mr.run.CrunchRuntimeException: java.io.IOException: File already exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:44)
at org.apache.crunch.MapFn.process(MapFn.java:34)
at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
at org.apache.crunch.MapFn.process(MapFn.java:34)
at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
at org.apache.crunch.CombineFn$AggregatorCombineFn.process(CombineFn.java:87)
at org.apache.crunch.CombineFn$AggregatorCombineFn.process(CombineFn.java:72)
at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
at org.apache.crunch.MapFn.process(MapFn.java:34)
at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:100)
at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:61)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.io.IOException: File already exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:228)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:128)
at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.getRecordWriter(CrunchMultipleOutputs.java:416)
at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.write(CrunchMultipleOutputs.java:378)
at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.write(CrunchMultipleOutputs.java:356)
at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:42)