gora-dev mailing list archives

From "Alfonso Nishikawa (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (GORA-207) Verify storage and retrieval of Avro Union data type within Gora-HBase
Date Thu, 02 May 2013 11:34:17 GMT

    [ https://issues.apache.org/jira/browse/GORA-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647449#comment-13647449 ]

Alfonso Nishikawa edited comment on GORA-207 at 5/2/13 11:33 AM:
-----------------------------------------------------------------

Uploaded the latest patch for this issue: GORA-207_over_r1465660.patch

The patch writes the top-level fields of a record (the outermost ones, not nested ones)
as follows:

If schema = ["null","type"] => write as ["type"]
else => write serialized as usual.
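
To make the rule above concrete, here is a minimal sketch in plain Java. It is not the code in the patch: the class and method names are hypothetical, a schema is modeled simply as the list of its union branch names (a non-union type is a one-element list), and I assume the rule only matches a two-branch union whose first branch is "null", exactly as written above.

```java
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical illustration of the top-level field rule; not Gora's API.
 * A schema is modeled as the list of its union branch type names.
 */
final class TopLevelUnionRule {

    private TopLevelUnionRule() {
    }

    /** True only for a two-branch union of the form ["null", T] (assumed order). */
    static boolean isNullableUnion(List<String> branches) {
        return branches.size() == 2
                && "null".equals(branches.get(0))
                && !"null".equals(branches.get(1));
    }

    /**
     * Returns the schema a top-level field would be written with:
     * ["null", T] is reduced to [T]; anything else is returned unchanged,
     * meaning "write serialized as usual".
     */
    static List<String> schemaToWrite(List<String> branches) {
        if (isNullableUnion(branches)) {
            return Collections.singletonList(branches.get(1));
        }
        return branches;
    }
}
```

Under these assumptions, a field declared as ["null","string"] is written with the same schema as a plain "string" field, while a three-branch union such as ["null","string","int"] is left untouched.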

I dropped the "optional opt-out configuration, blah, blah, blah" because it was introducing
really unnecessary complexity. I expect no one will want the "always serialized" behavior.

ONE BIG PROBLEM:
The solution keeps existing data compatible when Nutch's WebPage schema is updated to make
its top-level fields optional, but it is not possible to make nested data compatible :(
The solution is quite dirty; too dirty. I will have to write about it, and we should vote
:\
It is impossible to keep nested schemas compatible (I guess almost no one uses them that way)
when their fields are converted to optional (+null type).
If people don't change their schemas, nothing bad will happen :)

Any comments on this? If not, I will commit and close this issue (GORA-207).
                
> Verify storage and retrieval of Avro Union data type within Gora-HBase
> ----------------------------------------------------------------------
>
>                 Key: GORA-207
>                 URL: https://issues.apache.org/jira/browse/GORA-207
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: gora-hbase
>    Affects Versions: 0.3
>            Reporter: Renato Javier Marroquín Mogrovejo
>            Assignee: Alfonso Nishikawa
>             Fix For: 0.3
>
>         Attachments: GORA-207_over_r1462580.patch, GORA-207_over_r1463759.patch, GORA-207_over_r1465660.patch, GORA-207.patch, GORA-207.patch, GORA-207.patch
>
>
> The necessary features should be added to confirm that we are able to support Avro Union data types

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
