gora-dev mailing list archives

From Alfonso Nishikawa <alfonso.nishik...@gmail.com>
Subject Changes to GORA-174 tests
Date Tue, 07 May 2013 22:51:20 GMT
Hi all,

Regarding GORA-174 ([0] GORA compiler does not handle
["string", "null"] unions in the AVRO schema), Lewis has pointed out
that we (especially "I" ;) should stick to the requirements of the issue.
Without a doubt, this is true!

I would like to open a short (short short!) debate about that specification,
because I feel reluctant to go on without an acknowledgement (and Lewis
suggested asking everyone). Here is Nutch's WebPage schema as an example:

{
  "type": "record",
  "name": "WebPage",
  "namespace": "org.apache.gora.examples.generated",
  "fields" : [
    {"name": "url", "type": "string"},
    {"name": "content", "type": ["null","bytes"]},
    {"name": "parsedContent", "type": {"type":"array", "items": "string"}},
    {"name": "outlinks", "type": {"type":"map", "values":"string"}},
    {"name": "metadata", "type": {
      "name": "Metadata",
      "type": "record",
      "namespace": "org.apache.gora.examples.generated",
      "fields": [
        {"name": "version", "type": "int"},
        {"name": "data", "type": {"type": "map", "values": "string"}}
      ]
    }}
  ]
}

Looking at the original issue, NUTCH-1477 [1], I saw that the problem there
was about a ["null","bytes"] union, so I think we should not limit ourselves
to solving only ["null","string"].

In the schema shown above, "metadata" is mandatory, and GORA-174 says
nothing about optional records. Maybe we should fix that too.
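
To make "optional record" concrete at the schema level, here is a rough
sketch that wraps a field's type in a ["null", ...] union. It uses only
Python's json module on a trimmed copy of the schema above; the helper name
make_optional is hypothetical and is not part of Gora or Avro, and nothing
here says how the Gora compiler should handle the result.

```python
import json

# Trimmed copy of the WebPage schema from this message (two fields kept).
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "WebPage",
  "namespace": "org.apache.gora.examples.generated",
  "fields": [
    {"name": "url", "type": "string"},
    {"name": "metadata", "type": {
      "name": "Metadata",
      "type": "record",
      "namespace": "org.apache.gora.examples.generated",
      "fields": [
        {"name": "version", "type": "int"},
        {"name": "data", "type": {"type": "map", "values": "string"}}
      ]
    }}
  ]
}
""")


def make_optional(schema, field_name):
    """Hypothetical helper: wrap the named top-level field in ["null", T]."""
    for field in schema["fields"]:
        if field["name"] == field_name:
            current = field["type"]
            if not isinstance(current, list):
                # Plain type (primitive, record, map, array): build a union.
                field["type"] = ["null", current]
            elif "null" not in current:
                # Already a union: just add the "null" branch.
                field["type"] = ["null"] + current
            return schema
    raise KeyError(field_name)


make_optional(SCHEMA, "metadata")
metadata_type = SCHEMA["fields"][1]["type"]
print(json.dumps(metadata_type, indent=2))
```

After the call, "metadata" is a two-branch union whose second branch is the
original Metadata record.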

One more thing: the ["null","string"] requirement implies that nested
records must handle it too. In the example above, "Metadata : data" should
allow a map of ["null","string"] values and, *supposing "Metadata : version"
were a String*, "Metadata : version" should allow the type ["null","string"].
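
As an illustration of the nested case, the sketch below rewrites every
"string" in the Metadata record to ["null","string"], including map values.
It assumes, as in the paragraph above, that "version" is a string. The
helper nullable_strings is hypothetical (plain Python over the JSON form of
the schema), not something that exists in Gora or Avro.

```python
def nullable_strings(node):
    """Hypothetical helper: turn every "string" type into ["null", "string"]."""
    if node == "string":
        return ["null", "string"]
    if isinstance(node, list):
        # Existing union. Avro unions may not directly contain other unions,
        # so primitive branches stay as they are; only add a "null" branch.
        branches = [b if isinstance(b, str) else nullable_strings(b) for b in node]
        if "string" in branches and "null" not in branches:
            branches.insert(0, "null")
        return branches
    if isinstance(node, dict):
        out = dict(node)
        kind = node.get("type")
        if kind == "record":
            out["fields"] = [
                dict(f, type=nullable_strings(f["type"])) for f in node["fields"]
            ]
        elif kind == "map":
            out["values"] = nullable_strings(node["values"])
        elif kind == "array":
            out["items"] = nullable_strings(node["items"])
        return out
    return node  # other primitives such as "int" or "bytes"


# Metadata record from the schema above, with "version" supposed to be a string.
METADATA = {
    "name": "Metadata",
    "type": "record",
    "namespace": "org.apache.gora.examples.generated",
    "fields": [
        {"name": "version", "type": "string"},
        {"name": "data", "type": {"type": "map", "values": "string"}},
    ],
}

result = nullable_strings(METADATA)
for field in result["fields"]:
    print(field["name"], field["type"])
```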

If this is not desired, we will have to redefine the issue's requirements.
For example, something like: "allow [null,String] on topmost record fields".
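
A restricted requirement like that could be checked mechanically. The sketch
below lists where unions occur in a schema and tests whether all of them sit
directly on fields of the topmost record. Both helpers are hypothetical, and
counting a union inside a top-level map or array as "not topmost" is my own
assumption about what the restriction would mean.

```python
import copy
import json

WEBPAGE = json.loads("""
{
  "type": "record",
  "name": "WebPage",
  "namespace": "org.apache.gora.examples.generated",
  "fields": [
    {"name": "url", "type": "string"},
    {"name": "content", "type": ["null", "bytes"]},
    {"name": "parsedContent", "type": {"type": "array", "items": "string"}},
    {"name": "outlinks", "type": {"type": "map", "values": "string"}},
    {"name": "metadata", "type": {
      "name": "Metadata",
      "type": "record",
      "namespace": "org.apache.gora.examples.generated",
      "fields": [
        {"name": "version", "type": "int"},
        {"name": "data", "type": {"type": "map", "values": "string"}}
      ]
    }}
  ]
}
""")


def union_paths(node, path=""):
    """Hypothetical helper: yield a dotted path for every union in the schema."""
    if isinstance(node, list):
        yield path
        for branch in node:
            if isinstance(branch, dict):
                yield from union_paths(branch, path)
    elif isinstance(node, dict):
        kind = node.get("type")
        if kind == "record":
            for field in node["fields"]:
                child = f"{path}.{field['name']}" if path else field["name"]
                yield from union_paths(field["type"], child)
        elif kind == "map":
            yield from union_paths(node["values"], path + "{}")
        elif kind == "array":
            yield from union_paths(node["items"], path + "[]")


def unions_only_on_top_level(schema):
    """True if every union is the direct type of a topmost record field."""
    return all(not any(ch in p for ch in ".{}[]") for p in union_paths(schema))


paths = list(union_paths(WEBPAGE))

# Variant with a nested union: Metadata.data becomes a map of ["null","string"].
nested = copy.deepcopy(WEBPAGE)
nested["fields"][4]["type"]["fields"][1]["type"]["values"] = ["null", "string"]
nested_paths = list(union_paths(nested))

print(paths, unions_only_on_top_level(WEBPAGE))
print(nested_paths, unions_only_on_top_level(nested))
```

With the schema from this message, the only union found is "content"; the
variant adds one at "metadata.data{}" and so fails the check.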

===============
Taking ONLY the GORA-174 title, ["null","string"], I will have to make these
modifications:

- Modify Nutch's webpage.avsc. "Content" will have to be mandatory :(
- Modify the tests, specifically testGetNested(), to check nested
["null","string"] unions. I think the Cassandra module will not pass this test.
===============

Lewis mentioned creating separate issues for nested and multi-type unions.
That is not my view, but I will go along with the common decision :)

Opinions?

Thanks at least for reading and getting to this line! :)

Regards,

Alfonso Nishikawa

[0] - https://issues.apache.org/jira/browse/GORA-174
[1] - https://issues.apache.org/jira/browse/NUTCH-1477

-- 
"Drinking bloody marys all night will make you feel like a corpse in the
morning."
