nutch-dev mailing list archives

From "Cosmin Lehene (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Date Thu, 02 Apr 2009 19:39:12 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cosmin Lehene updated NUTCH-692:
--------------------------------

    Attachment: NUTCH-692.patch

This patch checks whether the destination files exist before attempting to create a new output
MapFile for the reduce task in FetcherOutputFormat and ParseOutputFormat. If the destination
files exist, it deletes them.
The AlreadyBeingCreatedException is thrown when a retried task attempts to create the same
MapFile that the previous, failed task attempt had already started to create.
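
For illustration only, here is a minimal sketch of the check-then-delete idea described above.
It is not the attached patch: it uses java.nio.file on a local filesystem as a stand-in for
the Hadoop FileSystem calls (exists and recursive delete), and the class and method names
(StaleOutputCleanup, deleteIfPresent, createOutput) are hypothetical. It models a MapFile as
a directory holding a "data" and an "index" file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch or Hadoop.
final class StaleOutputCleanup {

    // Removes `out` (a file or a whole directory tree) if an earlier,
    // failed attempt left it behind. Returns true if anything was deleted.
    static boolean deleteIfPresent(Path out) throws IOException {
        if (!Files.exists(out)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(out)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    // Stand-in for opening the reduce task's output: clear any leftovers
    // first, then create the directory with its two member files.
    static void createOutput(Path out) throws IOException {
        deleteIfPresent(out);
        Files.createDirectories(out);
        Files.createFile(out.resolve("data"));
        Files.createFile(out.resolve("index"));
    }
}
```

Calling createOutput twice on the same path succeeds in this sketch because the second call
removes what the first one left behind. Without the deleteIfPresent call, the second
Files.createFile would fail because the file already exists, which is loosely analogous to
the retried reduce task hitting the leftover output.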


> AlreadyBeingCreatedException with Hadoop 0.19
> ---------------------------------------------
>
>                 Key: NUTCH-692
>                 URL: https://issues.apache.org/jira/browse/NUTCH-692
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Julien Nioche
>         Attachments: NUTCH-692.patch
>
>
> I have been using the SVN version of Nutch on an EC2 cluster and got some AlreadyBeingCreatedException
> during the reduce phase of a parse. For some reason one of my tasks crashed and then I ran
> into this AlreadyBeingCreatedException when other nodes tried to pick it up.
> There was recently a discussion on the Hadoop user list on similar issues with Hadoop
> 0.19 (see http://markmail.org/search/after+upgrade+to+0%2E19%2E0). I have not tried using
> 0.18.2 yet but will do if the problems persist with 0.19.
> I was wondering whether anyone else had experienced the same problem. Do you think 0.19
> is stable enough to use it for Nutch 1.0?
> I will be running a crawl on a super large cluster in the next couple of weeks and I
> will confirm this issue.
> J.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

