flink-issues mailing list archives

From "Greg Hogan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2716) Checksum method for DataSet and Graph
Date Fri, 09 Oct 2015 14:13:26 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950446#comment-14950446 ]

Greg Hogan commented on FLINK-2716:

{{TaskConfig}} has {{getInputSerializer(int, ClassLoader)}} and {{getInputComparator(int,
ClassLoader)}}. What API could be added to {{RuntimeContext}} other than {{getInputSerializer(int)}}
and {{getInputComparator(int)}}? Modifying the {{RichFunction}} API with no-arg functions
both informs and restricts the user.

I had thought that serializers and comparators were paired, though in {{TaskConfig}} there
is a single {{OutputSerializer}} and multiple {{OutputComparators}}.

> Checksum method for DataSet and Graph
> -------------------------------------
>                 Key: FLINK-2716
>                 URL: https://issues.apache.org/jira/browse/FLINK-2716
>             Project: Flink
>          Issue Type: Improvement
>          Components: Gelly, Java API, Scala API
>    Affects Versions: 0.10
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>            Priority: Minor
> {{DataSet.count()}}, {{Graph.numberOfVertices()}}, and {{Graph.numberOfEdges()}} provide
measures of the number of distributed data elements. New {{DataSet.checksum()}} and {{Graph.checksum()}}
methods will summarize the content of data elements and support algorithm validation, integration
testing, and benchmarking.
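
To illustrate what such a summary could look like, here is a minimal plain-Java sketch. It is not the Flink implementation: the class name {{ChecksumSketch}}, the {{Result}} type, and the choice of "element count plus sum of {{hashCode()}} values" are assumptions made only for this example. Summing hash codes makes the result independent of element order, which matters because a distributed data set has no fixed ordering.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: an order-independent summary of a collection.
// Nothing here is part of the Flink API.
public final class ChecksumSketch {

    /** Holds the element count and the sum of element hash codes. */
    public static final class Result {
        public final long count;
        public final long checksum;

        Result(long count, long checksum) {
            this.count = count;
            this.checksum = checksum;
        }

        @Override
        public String toString() {
            return "count=" + count + ", checksum=" + Long.toHexString(checksum);
        }
    }

    /** Counts the elements and sums their hash codes as unsigned 32-bit values. */
    public static <T> Result checksum(Iterable<T> elements) {
        long count = 0L;
        long sum = 0L;
        for (T element : elements) {
            count++;
            // Objects.hashCode returns 0 for null; the mask treats the int as unsigned.
            sum += Objects.hashCode(element) & 0xffffffffL;
        }
        return new Result(count, sum);
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("a", "b", "c");
        List<String> second = Arrays.asList("c", "a", "b");
        // Both lines print the same summary because addition is commutative.
        System.out.println(checksum(first));
        System.out.println(checksum(second));
    }
}
```

Two collections with the same elements in a different order yield the same result, so the summary can be compared across runs or parallelism settings. A simple sum like this can collide for different contents, so under these assumptions it is a sanity check rather than a proof of equality.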
