nutch-dev mailing list archives

From "Susam Pal" <susam....@gmail.com>
Subject Re: nutch latest build - inject operation failing
Date Thu, 14 Feb 2008 16:37:38 GMT
What I did try was setting hadoop.tmp.dir to /opt/tmp, and the behavior
was strange. I already had an /opt/tmp directory in my Cygwin
installation (absolute Windows path: D:\Cygwin\opt\tmp) and expected
Hadoop to use it. Instead, it created a new D:\opt\tmp and wrote the
temp files there, which of course failed with the same error.
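
For reference, the override itself was just the standard property block
(a minimal sketch, assuming conf/hadoop-site.xml is where the override
gets picked up):

<property>
<name>hadoop.tmp.dir</name>
<value>/opt/tmp</value>
<description>Base for temporary files; the value I tried under Cygwin</description>
</property>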

Right now I don't have a Windows system with me. I will try setting it
to /cygdrive/d/tmp/ tomorrow, when I have access to a Windows system
again, and then I'll update the mailing list with my observations.
Thanks for the suggestion.
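
Concretely, the variant I plan to try is the same property with the
Cygwin-style path (again only a sketch; whether Hadoop resolves this
path correctly under Cygwin is exactly what I want to verify):

<property>
<name>hadoop.tmp.dir</name>
<value>/cygdrive/d/tmp</value>
<description>Cygwin-style path for D:\tmp, per the suggestion below</description>
</property>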

Regards,
Susam Pal

On Thu, Feb 14, 2008 at 9:41 PM, Dennis Kubes <kubes@apache.org> wrote:
> I think what might be occurring is a file path issue with Hadoop.  I
>  have seen it in the past.  Can you try on Windows using the cygdrive
>  path and see if that works?  For the path below it would be /cygdrive/D/tmp/ ...
>
>  Dennis
>
>
>
>  Susam Pal wrote:
>  > I can confirm this error, as I just tried running the latest revision
>  > of Nutch, rev-620818, on Debian as well as on Cygwin on Windows.
>  >
>  > It works fine on Debian but fails on Cygwin with this error:-
>  >
>  > 2008-02-14 19:49:47,756 WARN  regex.RegexURLNormalizer - can't find
>  > rules for scope 'inject', using default
>  > 2008-02-14 19:49:48,381 WARN  mapred.LocalJobRunner - job_local_1
>  > java.io.IOException: Target
>  > file:/D:/tmp/hadoop-guest/mapred/temp/inject-temp-322737506/_reduce_bjm6rw/part-00000
>  > already exists
>  >       at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>  >       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>  >       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>  >       at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>  >       at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>  >       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>  >       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>  >       at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>  >       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>  > 2008-02-14 19:49:49,225 FATAL crawl.Injector - Injector:
>  > java.io.IOException: Job failed!
>  >       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
>  >       at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
>  >       at org.apache.nutch.crawl.Injector.run(Injector.java:192)
>  >       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>  >       at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
>  >       at org.apache.nutch.crawl.Injector.main(Injector.java:182)
>  >
>  > Indeed, the 'inject-temp-322737506' directory is present in the
>  > specified folder on the D drive and doesn't get deleted.
>  >
>  > Is this because multiple map/reduce tasks are running speculatively
>  > and one of them finds the directory already present and therefore fails?
>  >
>  > To rule that out, I also tried setting this in 'conf/hadoop-site.xml':-
>  >
>  > <property>
>  > <name>mapred.speculative.execution</name>
>  > <value>false</value>
>  > <description></description>
>  > </property>
>  >
>  > I wonder why the same issue doesn't occur on Linux. I am not yet well
>  > acquainted with the Hadoop code. Could someone shed light on what
>  > might be going wrong?
>  >
>  > Regards,
>  > Susam Pal
>  >
>  > On 2/7/08, DS jha <aedsjha@gmail.com> wrote:
>  >> Hi -
>  >> Looks like the latest trunk version of Nutch is failing with the
>  >> following exception when trying to perform the inject operation:
>  >>
>  >> java.io.IOException: Target
>  >> file:/tmp/hadoop-user/mapred/temp/inject-temp-1280136828/_reduce_dv90x0/part-00000
>  >> already exists
>  >>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>  >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>  >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>  >>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>  >>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>  >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>  >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>  >>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>  >>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>  >>
>  >> Any thoughts?
>  >>
>  >> Thanks
>  >> Jha
>  >>
>
