hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13010) Refactor raw erasure coders
Date Tue, 19 Apr 2016 01:31:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246922#comment-15246922 ]

Colin Patrick McCabe commented on HADOOP-13010:

{{ErasureCoderConf#setCoderOption}} / {{ErasureCoderConf#getCoderOption}}: I don't see why
we need to have these.  If these options are generic to all erasure encoders, then they can
just be regular Java fields like {{ErasureCoderConf#numDataUnits}}, etc.  On the other
hand, if these options only apply to one type of Coder, then they should be stored in the
particular type of coder they apply to.

The usual way to do this is to have your Encoder / Decoder class take a Configuration object
as an argument, and pull out whatever values it needs.
For example, you might have code like this:
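FoobarEncoder(Configuration conf) {
  // Options shared by all coders are wrapped by ErasureCoderConf.
  this.coderConf = new ErasureCoderConf(conf);
  // Options specific to this coder are read directly, with a default.
  this.foobarity = conf.getLong("foobarity", 123);
}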

The idea is that things that are specific to a class go in that class, rather than trying
to handle them with casts to and from Object.
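
To make that concrete, here is a slightly fuller sketch of the same pattern. It is only an illustration, not code from any patch: {{SimpleConf}} is a hypothetical stand-in for Hadoop's {{Configuration}} so the example compiles on its own, and the key names and default values (other than "foobarity" and 123 above) are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo entry point; not part of any patch.
class FoobarDemo {
  public static void main(String[] args) {
    SimpleConf conf = new SimpleConf();
    conf.set("foobarity", "456");
    FoobarEncoder encoder = new FoobarEncoder(conf);
    System.out.println("foobarity = " + encoder.getFoobarity());
    System.out.println("numDataUnits = " + encoder.getCoderConf().getNumDataUnits());
  }
}

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
class SimpleConf {
  private final Map<String, String> values = new HashMap<>();

  void set(String key, String value) {
    values.put(key, value);
  }

  long getLong(String key, long defaultValue) {
    String v = values.get(key);
    return v == null ? defaultValue : Long.parseLong(v);
  }

  int getInt(String key, int defaultValue) {
    String v = values.get(key);
    return v == null ? defaultValue : Integer.parseInt(v);
  }
}

// Options generic to all coders live here as plain typed fields.
// The key names and defaults are assumptions made for this sketch.
class ErasureCoderConf {
  private final int numDataUnits;
  private final int numParityUnits;

  ErasureCoderConf(SimpleConf conf) {
    this.numDataUnits = conf.getInt("numDataUnits", 6);
    this.numParityUnits = conf.getInt("numParityUnits", 3);
  }

  int getNumDataUnits() { return numDataUnits; }
  int getNumParityUnits() { return numParityUnits; }
}

// An option that only this coder understands is a typed field of this
// coder, so no Object-valued option map and no casts are needed.
class FoobarEncoder {
  private final ErasureCoderConf coderConf;
  private final long foobarity;

  FoobarEncoder(SimpleConf conf) {
    this.coderConf = new ErasureCoderConf(conf);
    this.foobarity = conf.getLong("foobarity", 123);
  }

  ErasureCoderConf getCoderConf() { return coderConf; }
  long getFoobarity() { return foobarity; }
}
```

Because each value is read once in the constructor into a typed field, callers never see a generic option bag.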

Also, mutable configuration is unpleasant: what happens if you call {{ErasureCoderConf#setCoderOption}}
when the Encoder / Decoder has already been created?  It seems like what we actually want
to do in this case is not modify the configuration, but build a new Encoder / Decoder with
a new configuration.
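
A rough sketch of what "build a new one instead of mutating" could look like follows. All class and method names here ({{ImmutableFoobarConf}}, {{withFoobarity}}, {{ImmutableFoobarEncoder}}) are hypothetical and chosen for illustration; nothing like them is claimed to exist in the patch.

```java
// Hypothetical demo entry point; not part of any patch.
class RebuildDemo {
  public static void main(String[] args) {
    ImmutableFoobarConf original = new ImmutableFoobarConf(6, 3, 123L);
    ImmutableFoobarEncoder first = new ImmutableFoobarEncoder(original);

    // "Changing" an option yields a new conf and a new encoder;
    // the encoder that already exists is left alone.
    ImmutableFoobarConf changed = original.withFoobarity(456L);
    ImmutableFoobarEncoder second = new ImmutableFoobarEncoder(changed);

    System.out.println("first foobarity = " + first.getFoobarity());
    System.out.println("second foobarity = " + second.getFoobarity());
  }
}

// All fields are final and there are no setters, so an encoder can never
// observe its configuration changing after construction.
final class ImmutableFoobarConf {
  private final int numDataUnits;
  private final int numParityUnits;
  private final long foobarity;

  ImmutableFoobarConf(int numDataUnits, int numParityUnits, long foobarity) {
    this.numDataUnits = numDataUnits;
    this.numParityUnits = numParityUnits;
    this.foobarity = foobarity;
  }

  // Returns a modified copy instead of mutating this instance.
  ImmutableFoobarConf withFoobarity(long newFoobarity) {
    return new ImmutableFoobarConf(numDataUnits, numParityUnits, newFoobarity);
  }

  int getNumDataUnits() { return numDataUnits; }
  int getNumParityUnits() { return numParityUnits; }
  long getFoobarity() { return foobarity; }
}

final class ImmutableFoobarEncoder {
  private final ImmutableFoobarConf conf;

  ImmutableFoobarEncoder(ImmutableFoobarConf conf) {
    this.conf = conf;
  }

  long getFoobarity() { return conf.getFoobarity(); }
  int getNumDataUnits() { return conf.getNumDataUnits(); }
}
```

With this shape, the question of what happens when an option is set after the coder exists does not arise, since there is nothing to set.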

> Refactor raw erasure coders
> ---------------------------
>                 Key: HADOOP-13010
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13010
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>             Fix For: 3.0.0
>         Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch, HADOOP-13010-v3.patch
> This will refactor raw erasure coders according to some comments received so far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to rely on class inheritance to reuse the code; instead it can be moved to some utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a state holder to keep some checking results for later reuse during an encode/decode call.
> This would not get rid of some inheritance levels, as how to do so isn't clear yet for the moment and it would also incur a big impact. I do wish the end result of this refactoring will make all the levels clearer and easier to follow.

This message was sent by Atlassian JIRA
