beam-commits mailing list archives

From "Anton Kedin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-3574) [SQL] Support schema qualifiers for field names
Date Wed, 31 Jan 2018 06:51:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anton Kedin updated BEAM-3574:
------------------------------
    Description: 
BeamRecord currently has utility methods that get field values by name, e.g.
BeamRecord.getFieldValue(String name). Internally they call
fieldNamesArrayList.indexOf(fieldName) to find the index of the field.

This works as long as there is only one field with that name in the record. But when joining
2 records you can end up with duplicate field names, with no means of distinguishing them or
of getting a value from a specific field by name, because indexOf returns only the first
match. We don't keep any metadata in BeamRecordType to help identify a field in this case.

This can lead to obscure bugs, since a lookup by name may silently return the value of the
wrong field.
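
A minimal sketch of the failure mode described above. It does not use BeamRecord itself; the
class DuplicateFieldLookup and its getFieldValue helper are hypothetical stand-ins that assume
the lookup is a plain List.indexOf over the field names, as the description states.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a joined record; this is not Beam's BeamRecord class.
final class DuplicateFieldLookup {
    // Mirrors the lookup described above: List.indexOf returns the first match only.
    static Object getFieldValue(List<String> fieldNames, List<Object> values, String name) {
        int index = fieldNames.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        return values.get(index);
    }

    public static void main(String[] args) {
        // Field names and values after joining two records that both have an "id" field.
        List<String> fieldNames = Arrays.asList("id", "name", "id", "price");
        List<Object> values = Arrays.<Object>asList(1, "order", 42, 9.99);
        // Resolves to the left-hand "id"; the right-hand "id" cannot be reached by name.
        System.out.println(getFieldValue(fieldNames, values, "id"));
    }
}
```

With these made-up values, looking up "id" always yields the value from the first record, and
nothing signals that a second "id" exists.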

We should probably keep more detailed schema information attached to the fields, so that we
can reference them using qualifiers like "[schemaA].[pcollectionB].[fieldC]".
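
One way the qualifier idea could look, as a rough sketch only. QualifiedFields, its add and get
methods, and the "schemaA.orders" style qualifier strings are all hypothetical; nothing here is
an existing Beam API. Each field keeps the qualifier of the collection it came from, and an
unqualified lookup fails on ambiguity rather than returning the first match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each field remembers the qualifier of its source collection.
final class QualifiedFields {
    private final List<String> qualifiers = new ArrayList<>();
    private final List<String> names = new ArrayList<>();
    private final List<Object> values = new ArrayList<>();

    QualifiedFields add(String qualifier, String name, Object value) {
        qualifiers.add(qualifier);
        names.add(name);
        values.add(value);
        return this;
    }

    // Qualified lookup, e.g. get("schemaA.orders", "id").
    Object get(String qualifier, String name) {
        for (int i = 0; i < names.size(); i++) {
            if (qualifiers.get(i).equals(qualifier) && names.get(i).equals(name)) {
                return values.get(i);
            }
        }
        throw new IllegalArgumentException("No such field: " + qualifier + "." + name);
    }

    // Unqualified lookup: throws when the name matches more than one field.
    Object get(String name) {
        int found = -1;
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i).equals(name)) {
                if (found >= 0) {
                    throw new IllegalArgumentException("Ambiguous field: " + name);
                }
                found = i;
            }
        }
        if (found < 0) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        return values.get(found);
    }
}
```

The linear scan keeps the sketch short; a real design would more likely store the qualifier in
the record type's metadata and resolve names once, when the query is planned.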

 

  was:
Currently there are utility methods in BeamRecord to get field values by name, e.g. BeamRecord.getFieldValue(String
name). Internally they call fieldNamesArrayList.indexOf(fieldName) to find the index of the
field name.

This works as long as there is only one field with such name in the record. But when joining
2 records you can end up with duplicate field names, and without any means of distinguishing
them and getting a value from specific field by name. We don't keep any metadata in BeamRecordType
to help identify a field in this case. 

It feels that this can lead to obscure bugs.

We probably should keep more detailed schema information attached to the fields, so that we
could reference them using qualifiers like "[schemaA].[pcollectionB].[fieldC]".

 


> [SQL] Support schema qualifiers for field names
> -----------------------------------------------
>
>                 Key: BEAM-3574
>                 URL: https://issues.apache.org/jira/browse/BEAM-3574
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>            Reporter: Anton Kedin
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
