crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-263) Create intelligent defaults for the CrunchCombineFileInputFormat
Date Sat, 07 Sep 2013 01:01:37 GMT


Josh Wills updated CRUNCH-263:

    Attachment: CRUNCH-263.patch

Here's a patch that sets the default max split size based on the dfs.block.size for HDFS.
> Create intelligent defaults for the CrunchCombineFileInputFormat
> ----------------------------------------------------------------
>                 Key: CRUNCH-263
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core, IO
>    Affects Versions: 0.8.0
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.8.0
>         Attachments: CRUNCH-263.patch
>
> The CombineFileInputFormat is too aggressive at combining file blocks unless a sensible
> mapred.max.split.size value is set in the Configuration or is explicitly configured on the
> CombineFileInputFormat class itself. This JIRA seeks to set intelligent defaults for the max
> split size in the case that a default is not already provided and provide a way for Crunch
> clients to override the defaults via a configuration parameter.
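
For readers of the archive, the fallback logic described above might look roughly like the sketch below. It is not the contents of CRUNCH-263.patch. A plain Map stands in for the Hadoop Configuration so the example runs without Hadoop on the classpath. The override key name, the precedence order, and the 128 MB fallback are all assumptions made for illustration; only mapred.max.split.size and dfs.block.size come from the issue text.

```java
import java.util.Map;

// Illustrative sketch only: resolves a max split size for a combining input format.
final class SplitSizeDefaults {
    // Property names taken from the issue description.
    static final String MAX_SPLIT_SIZE = "mapred.max.split.size";
    static final String DFS_BLOCK_SIZE = "dfs.block.size";

    // Hypothetical client override key; the real parameter name is not given in this message.
    static final String OVERRIDE_KEY = "example.combine.file.block.size";

    // Assumed fallback when no block size is configured (128 MB).
    static final long FALLBACK_BLOCK_SIZE = 128L * 1024 * 1024;

    // Precedence (assumed): explicit max split size, then client override,
    // then the HDFS block size, then the hard-coded fallback.
    static long resolveMaxSplitSize(Map<String, String> conf) {
        String explicit = conf.get(MAX_SPLIT_SIZE);
        if (explicit != null) {
            return Long.parseLong(explicit);
        }
        String override = conf.get(OVERRIDE_KEY);
        if (override != null) {
            return Long.parseLong(override);
        }
        String blockSize = conf.get(DFS_BLOCK_SIZE);
        return blockSize != null ? Long.parseLong(blockSize) : FALLBACK_BLOCK_SIZE;
    }
}
```

With an empty map the method returns the assumed fallback; setting dfs.block.size alone makes that block size the cap on each combined split, which is the behavior the patch comment describes.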
