hadoop-mapreduce-dev mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6837) Add an equivalent to Crunch's Pair class
Date Thu, 26 Jan 2017 15:10:24 GMT
Daniel Templeton created MAPREDUCE-6837:

             Summary: Add an equivalent to Crunch's Pair class
                 Key: MAPREDUCE-6837
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6837
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv2
            Reporter: Daniel Templeton

Crunch has this great {{Pair}} class (https://crunch.apache.org/apidocs/0.14.0/org/apache/crunch/Pair.html)
that saves you from constantly implementing composite writables.  It seems silly that we still
don't have an equivalent in MR.

I would like to see a new class with the following API:

package org.apache.hadoop.io;

public class CompositeWritable<P extends WritableComparable<P>, S extends WritableComparable<S>>
    implements WritableComparable<CompositeWritable<P, S>> {
  public CompositeWritable(P primary, S secondary);
  public P getPrimary();
  public void setPrimary(P primary);
  public S getSecondary();
  public void setSecondary(S secondary);

  // Return true if both primaries and both secondaries are equal
  public boolean equals(Object o);

  // Return the primary's hash code
  public int hashCode();

  // Sort first by primary and then by secondary
  public int compareTo(CompositeWritable<P, S> o);

  public void readFields(DataInput in) throws IOException;
  public void write(DataOutput out) throws IOException;
}
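
To make the intended semantics concrete, here is a minimal sketch of the class described above. It is an illustration under assumptions, not the proposed implementation: the nested {{WritableComparable}} interface is a stand-in for the Hadoop one so the example compiles with only the JDK, {{StringKey}} and {{IntKey}} are hypothetical leaf types, and the setters are left out for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class CompositeWritableSketch {

  // Stand-in for org.apache.hadoop.io.WritableComparable (assumption: keeps the sketch Hadoop-free).
  interface WritableComparable<T> extends Comparable<T> {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  // Hypothetical leaf type holding a string.
  static final class StringKey implements WritableComparable<StringKey> {
    private String value;
    StringKey(String value) { this.value = value; }
    @Override public void write(DataOutput out) throws IOException { out.writeUTF(value); }
    @Override public void readFields(DataInput in) throws IOException { value = in.readUTF(); }
    @Override public int compareTo(StringKey o) { return value.compareTo(o.value); }
    @Override public boolean equals(Object o) {
      return o instanceof StringKey && value.equals(((StringKey) o).value);
    }
    @Override public int hashCode() { return value.hashCode(); }
  }

  // Hypothetical leaf type holding an int.
  static final class IntKey implements WritableComparable<IntKey> {
    private int value;
    IntKey(int value) { this.value = value; }
    @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
    @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    @Override public int compareTo(IntKey o) { return Integer.compare(value, o.value); }
    @Override public boolean equals(Object o) {
      return o instanceof IntKey && value == ((IntKey) o).value;
    }
    @Override public int hashCode() { return Integer.hashCode(value); }
  }

  static final class CompositeWritable<P extends WritableComparable<P>, S extends WritableComparable<S>>
      implements WritableComparable<CompositeWritable<P, S>> {
    private final P primary;
    private final S secondary;

    CompositeWritable(P primary, S secondary) {
      this.primary = primary;
      this.secondary = secondary;
    }
    P getPrimary() { return primary; }
    S getSecondary() { return secondary; }

    // Equal only if both primaries and both secondaries are equal.
    @Override public boolean equals(Object o) {
      if (!(o instanceof CompositeWritable)) return false;
      CompositeWritable<?, ?> other = (CompositeWritable<?, ?>) o;
      return primary.equals(other.primary) && secondary.equals(other.secondary);
    }

    // Hash on the primary only, so keys sharing a primary hash identically.
    @Override public int hashCode() { return primary.hashCode(); }

    // Sort first by primary, then by secondary.
    @Override public int compareTo(CompositeWritable<P, S> o) {
      int byPrimary = primary.compareTo(o.primary);
      return byPrimary != 0 ? byPrimary : secondary.compareTo(o.secondary);
    }

    @Override public void write(DataOutput out) throws IOException {
      primary.write(out);
      secondary.write(out);
    }
    @Override public void readFields(DataInput in) throws IOException {
      primary.readFields(in);
      secondary.readFields(in);
    }
  }

  static CompositeWritable<StringKey, IntKey> of(String primary, int secondary) {
    return new CompositeWritable<>(new StringKey(primary), new IntKey(secondary));
  }

  // Serializes source and reads the bytes back into target.
  static void copy(CompositeWritable<StringKey, IntKey> source,
                   CompositeWritable<StringKey, IntKey> target) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      source.write(new DataOutputStream(bytes));
      target.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    CompositeWritable<StringKey, IntKey> a = of("user1", 10);
    CompositeWritable<StringKey, IntKey> b = of("user1", 20);
    CompositeWritable<StringKey, IntKey> c = of("", 0);
    System.out.println(a.compareTo(b) < 0);           // same primary, smaller secondary sorts first
    System.out.println(a.hashCode() == b.hashCode()); // hash depends on the primary only
    copy(b, c);
    System.out.println(b.equals(c));                  // fields survive a write/readFields round trip
  }
}
```

Hashing on the primary alone presumably matters because a hash-based partitioner would then send every key with the same primary to the same reducer, whatever its secondary.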

With such a class, implementing a secondary sort would mean just implementing a custom grouping
comparator.  That comparator could be implemented as part of this JIRA:

package org.apache.hadoop.io;

public class CompositeGroupingComparator extends WritableComparator {
  // Group composite keys by comparing their primaries only
}

Or some such.
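
As a rough illustration of why a grouping comparator is all that a secondary sort would still need, the sketch below simulates the sort-then-group step with plain JDK collections. Everything in it is hypothetical: {{Key}} stands in for a composite key, {{SORT}} plays the role of {{compareTo}}, {{GROUP}} plays the role of the grouping comparator, and {{shuffle}} only mimics how the framework hands grouped values to a reducer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SecondarySortSketch {

  // Hypothetical stand-in for a composite key with a string primary and an int secondary.
  static final class Key {
    final String primary;
    final int secondary;
    Key(String primary, int secondary) {
      this.primary = primary;
      this.secondary = secondary;
    }
  }

  // Sort order: primary first, then secondary.
  static final Comparator<Key> SORT =
      Comparator.comparing((Key k) -> k.primary).thenComparingInt(k -> k.secondary);

  // Grouping order: primary only, so keys that differ only in secondary share a group.
  static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.primary);

  // Sorts the keys, then starts a new group whenever GROUP sees a different primary.
  static Map<String, List<Integer>> shuffle(List<Key> keys) {
    List<Key> sorted = new ArrayList<>(keys);
    sorted.sort(SORT);
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    Key previous = null;
    List<Integer> current = null;
    for (Key key : sorted) {
      if (previous == null || GROUP.compare(previous, key) != 0) {
        current = new ArrayList<>();
        groups.put(key.primary, current);
      }
      current.add(key.secondary);
      previous = key;
    }
    return groups;
  }

  public static void main(String[] args) {
    List<Key> keys = List.of(
        new Key("b", 2), new Key("a", 3), new Key("b", 1), new Key("a", 1));
    System.out.println(shuffle(keys)); // prints {a=[1, 3], b=[1, 2]}
  }
}
```

Each group receives its secondaries already in sorted order, which is the point of a secondary sort. In an actual job the grouping comparator would be registered with {{Job.setGroupingComparatorClass}}.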

Crunch also provides {{Tuple3}}, {{Tuple4}}, and {{TupleN}} classes, but I don't think we
need to add equivalents.  If someone really wants that capability, they can nest composite
writables.

Don't forget to add unit tests!
