beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (BEAM-5092) Nexmark 10x performance regression
Date Wed, 08 Aug 2018 20:02:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-5092?focusedWorklogId=132570&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-132570 ]

ASF GitHub Bot logged work on BEAM-5092:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Aug/18 20:01
            Start Date: 08/Aug/18 20:01
    Worklog Time Spent: 10m 
      Work Description: timrobertson100 commented on a change in pull request #6176: [BEAM-5092] Row comparison should be faster when both are POJOs.
URL: https://github.com/apache/beam/pull/6176#discussion_r208715945
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/RowWithGetters.java
 ##########
 @@ -123,4 +124,27 @@ public int getFieldCount() {
   public Object getGetterTarget() {
     return getterTarget;
   }
+
+  @Override
+  public boolean equals(Object o) {
+    if (this == o) {
+      return true;
+    }
+    if (o == null) {
+      return false;
+    }
+    if (o instanceof RowWithGetters) {
+      RowWithGetters other = (RowWithGetters) o;
+      return Objects.equals(getSchema(), other.getSchema())
+          && Objects.equals(getterTarget, other.getterTarget);
 
 Review comment:
   Will this always produce the results we want? I'm thinking aloud here...
   
   Presumably our objective here is to determine whether two `RowWithGetters` are equal.
   
   Since `getterTarget` is outside our control, there are cases that we should consider:
    1. it might not implement `equals` correctly
    2. truly differing objects could produce rows that are equal within the scope of the schema (e.g. they have extra fields that the schema ignores)
    3. if the target is an array, should this be using `Objects.deepEquals()`?
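
   To make the three cases concrete, here is a minimal sketch. It does not use any Beam classes: `NoEqualsPojo`, `ExtraFieldPojo`, and the method names are hypothetical stand-ins for a user-supplied getter target, and the idea that `internalId` is absent from the schema is an assumption made for illustration. It only shows how `Objects.equals` behaves on such targets.

```java
import java.util.Objects;

class GetterTargetEqualitySketch {

  // Hypothetical POJO that does not override equals(), so Object.equals()
  // falls back to reference identity.
  static final class NoEqualsPojo {
    final String name;

    NoEqualsPojo(String name) {
      this.name = name;
    }
  }

  // Hypothetical POJO whose equals() also compares a field that we assume
  // is NOT part of the schema.
  static final class ExtraFieldPojo {
    final String name; // assumed to be in the schema
    final long internalId; // assumed to be ignored by the schema

    ExtraFieldPojo(String name, long internalId) {
      this.name = name;
      this.internalId = internalId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ExtraFieldPojo)) {
        return false;
      }
      ExtraFieldPojo other = (ExtraFieldPojo) o;
      return name.equals(other.name) && internalId == other.internalId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, internalId);
    }
  }

  // Case 1: same field values, but no equals() override.
  static boolean missingEquals() {
    return Objects.equals(new NoEqualsPojo("a"), new NoEqualsPojo("a"));
  }

  // Case 2: identical on the schema field, different on the ignored field.
  static boolean extraFieldOutsideSchema() {
    return Objects.equals(new ExtraFieldPojo("a", 1L), new ExtraFieldPojo("a", 2L));
  }

  // Case 3: Objects.equals() compares arrays by reference.
  static boolean arrayWithEquals() {
    Object left = new int[] {1, 2, 3};
    Object right = new int[] {1, 2, 3};
    return Objects.equals(left, right);
  }

  // Case 3: Objects.deepEquals() compares array contents.
  static boolean arrayWithDeepEquals() {
    Object left = new int[] {1, 2, 3};
    Object right = new int[] {1, 2, 3};
    return Objects.deepEquals(left, right);
  }

  public static void main(String[] args) {
    System.out.println("case 1, missing equals:     " + missingEquals());
    System.out.println("case 2, extra field:        " + extraFieldOutsideSchema());
    System.out.println("case 3, Objects.equals:     " + arrayWithEquals());
    System.out.println("case 3, Objects.deepEquals: " + arrayWithDeepEquals());
  }
}
```

   In this sketch only the `Objects.deepEquals` call returns `true`. The other three comparisons return `false` even though a field-by-field comparison over the assumed schema would treat the values as equal, so the shortcut can report two rows as different when they are not.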

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 132570)
    Time Spent: 4.5h  (was: 4h 20m)

> Nexmark 10x performance regression
> ----------------------------------
>
>                 Key: BEAM-5092
>                 URL: https://issues.apache.org/jira/browse/BEAM-5092
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Andrew Pilloud
>            Assignee: Reuven Lax
>            Priority: Critical
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> There looks to be a 10x performance hit on the DirectRunner and Flink nexmark jobs. It first showed up in this build:
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Direct/151/changes]
> [https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424]
> [https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
