drill-dev mailing list archives

From "Rymar Maksym (Jira)" <j...@apache.org>
Subject [jira] [Created] (DRILL-7812) Broken equals/hashcode contract
Date Wed, 25 Nov 2020 16:05:00 GMT
Rymar Maksym created DRILL-7812:

             Summary: Broken equals/hashcode contract 
                 Key: DRILL-7812
                 URL: https://issues.apache.org/jira/browse/DRILL-7812
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Rymar Maksym
            Assignee: Rymar Maksym

The *MaterializedField* class [has a broken equals/hashCode contract|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java#L192]:

{{If two objects are equal according to the equals(Object) method, then calling the hashCode
method on each of the two objects must produce the same integer result.}}

In our case the *{{equals()}}* method depends on 2 fields: name and type, while the *{{hashCode()}}*
method depends on 3 fields: name, type and child. Two fields that are equal can therefore have
different hash codes, which leads to serious bugs. For example, it can occur in the *SortRecordBatchBuilder* class [here|https://github.com/apache/drill/blob/31d6086c4f814c1d7fc476095611e37cc3d95d1c/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/sort/SortRecordBatchBuilder.java#L142]:
if (batches.keySet().size() > 1) {
   throw UserException.validationError(null)
      .message("Sort currently only supports a single schema.")
*batches* is an *{{ArrayListMultimap<BatchSchema, RecordBatchData>}}*, and when a *{{RecordBatchData}}*
is inserted with a *{{BatchSchema}}* key, unexpected behavior occurs, because the *{{BatchSchema}}*
hashCode is based on the hashCode of MaterializedField:
public int hashCode() {
  final int prime = 31;
  int result = 1;
  result = prime * result + ((fields == null) ? 0 : fields.hashCode());
  result = prime * result + ((selectionVectorMode == null) ? 0 : selectionVectorMode.hashCode());
  return result;
}
So *{{RecordBatchData}}* instances with equal *{{BatchSchema}}* keys are going to be added to the *{{ArrayListMultimap}}*
as different entries. It's not a common situation, and it can most easily be reproduced with json
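
The effect described in this issue can be sketched outside of Drill. The example below is a simplified model, not Drill code: the {{Field}} class, the {{hashChildren}} flag and the {{distinctSchemaKeys}} helper are hypothetical names, a plain {{List<Field>}} stands in for *{{BatchSchema}}*, and a {{HashMap}} of lists stands in for the *{{ArrayListMultimap}}*. It only assumes what is stated above: equals() compares name and type, while hashCode() may also include the children.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified model of the issue; none of these classes are Drill's.
class EqualsHashCodeDemo {

  // Stand-in for a field whose equals() uses name and type only.
  static class Field {
    final String name;
    final String type;
    final List<Field> children = new ArrayList<>();
    final boolean hashChildren; // true reproduces the equals/hashCode mismatch

    Field(String name, String type, boolean hashChildren) {
      this.name = name;
      this.type = type;
      this.hashChildren = hashChildren;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Field)) {
        return false;
      }
      Field other = (Field) o;
      return name.equals(other.name) && type.equals(other.type);
    }

    @Override
    public int hashCode() {
      // Including children here breaks the contract, because equals() ignores them.
      return hashChildren ? Objects.hash(name, type, children) : Objects.hash(name, type);
    }
  }

  // Inserts two batches keyed by equal one-field schemas; returns the number of distinct keys.
  static int distinctSchemaKeys(boolean hashChildren) {
    Field plain = new Field("a", "MAP", hashChildren);
    Field withChild = new Field("a", "MAP", hashChildren);
    withChild.children.add(new Field("x", "INT", hashChildren));

    // A schema is modeled as a list of fields; List.hashCode() is derived from element hash codes.
    List<Field> schema1 = Collections.singletonList(plain);
    List<Field> schema2 = Collections.singletonList(withChild);

    // A map of lists stands in for ArrayListMultimap<BatchSchema, RecordBatchData>.
    Map<List<Field>, List<String>> batches = new HashMap<>();
    batches.computeIfAbsent(schema1, k -> new ArrayList<>()).add("batch-1");
    batches.computeIfAbsent(schema2, k -> new ArrayList<>()).add("batch-2");
    return batches.keySet().size();
  }

  public static void main(String[] args) {
    System.out.println("hashCode includes children: " + distinctSchemaKeys(true) + " key(s)");
    System.out.println("hashCode matches equals:    " + distinctSchemaKeys(false) + " key(s)");
  }
}
```

In this sketch the two schemas are equal according to equals(), yet with the mismatched hashCode() the map ends up with 2 keys, which is the condition that triggers the "single schema" check quoted above. When hashCode() uses the same fields as equals(), both batches land under 1 key. Deriving hashCode() from name and type only is one possible way to restore the contract; it is shown here as an illustration, not as the fix chosen for this issue.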


This message was sent by Atlassian Jira
