hadoop-mapreduce-dev mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MAPREDUCE-2715) submitAndMonitorJob() doesn't play nice with MultipleOutputFile
Date Wed, 27 Jul 2011 08:26:09 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved MAPREDUCE-2715.

    Resolution: Not A Problem

Geoffrey, having looked at the current stable release, 0.22, and trunk, I think we can close this
as 'Not A Problem', since the output directory check comes purely from the OutputFormat class
instance itself. That said, you should be fully able to remove that check in your MultipleOutputFormat
derivative by overriding the checkOutputSpecs method, as pointed out before.

In case that doesn't resolve it for you, do reopen!
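
The override suggested above can be sketched without a Hadoop dependency. The classes below are hypothetical stand-ins, not Hadoop's API: StrictOutputFormat plays the role of the stock output format whose checkOutputSpecs rejects an existing output directory, and DailyPartitionedOutputFormat plays the role of a MultipleOutputFormat derivative that relaxes the check. In Hadoop's old API the real method takes a FileSystem and a JobConf rather than a java.nio Path; the Path parameter and the generateFileName helper are assumptions made to keep the sketch runnable on a plain JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the stock output format:
// it refuses to run when the output directory already exists.
class StrictOutputFormat {
    void checkOutputSpecs(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory " + outputDir + " already exists");
        }
    }
}

// Hypothetical stand-in for a MultipleOutputFormat derivative
// that writes each day's records under its own subdirectory.
class DailyPartitionedOutputFormat extends StrictOutputFormat {
    @Override
    void checkOutputSpecs(Path outputDir) throws IOException {
        // Deliberately skip the "already exists" check so a shared base
        // directory can be reused; only reject a path that is not a directory.
        if (Files.exists(outputDir) && !Files.isDirectory(outputDir)) {
            throw new IOException("Output path " + outputDir + " is not a directory");
        }
    }

    // Illustrative helper (an assumption): build a per-day relative file name.
    String generateFileName(String day, String leafName) {
        return day + "/" + leafName;
    }
}

class OutputSpecDemo {
    public static void main(String[] args) throws IOException {
        // A non-empty base directory, as in the reported scenario.
        Path base = Files.createTempDirectory("daily-output");
        Files.createFile(base.resolve("part-00000"));

        try {
            new StrictOutputFormat().checkOutputSpecs(base);
        } catch (IOException e) {
            System.out.println("strict check rejected: " + e.getMessage());
        }

        DailyPartitionedOutputFormat relaxed = new DailyPartitionedOutputFormat();
        relaxed.checkOutputSpecs(base); // passes: an existing directory is allowed
        System.out.println("relaxed check passed; next file: "
                + relaxed.generateFileName("2011-07-27", "part-00000"));
    }
}
```

Note that once the check is relaxed, nothing stops a rerun from writing into a directory that already holds earlier output, so the derivative has to choose file names that do not collide.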

> submitAndMonitorJob() doesn't play nice with MultipleOutputFile
> ---------------------------------------------------------------
>                 Key: MAPREDUCE-2715
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2715
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Geoffrey Young
> part of submitAndMonitorJob() balks if the output directory already exists and is non-empty:
>   "Error launching job , Output path already exists : "
> this logic actually conflicts with the ideas behind MultipleOutputFile, where the output
> file path is calculated later on.
> it would be really nice to remove the restriction for non-empty output directories in
> submitAndMonitorJob() so that MultipleOutputFile becomes more useful - as it stands now, I
> can't, for example, specify a base output path then use MultipleOutputFile to partition by
> date on a daily basis.
> thanks.
