gora-dev mailing list archives

From Alfonso Nishikawa <alfonso.nishik...@gmail.com>
Subject Changes to GORA-174 tests
Date Tue, 07 May 2013 22:51:20 GMT
Hi all,

Regarding GORA-174 ([0], "GORA compiler does not handle ["string", "null"]
unions in the AVRO schema"), Lewis has pointed out that we (especially "I" ;)
should stick to the requirements of the issue. Without a doubt this is true!

I would like to open a short (short short!) debate about that specification,
because I feel reluctant to go on without an acknowledgement (and Lewis
suggested asking you
all). Here is Nutch's WebPage schema as an example:

{
  "type": "record",
  "name": "WebPage",
  "namespace": "org.apache.gora.examples.generated",
  "fields" : [
    {"name": "url", "type": "string"},
    {"name": "content", "type": ["null","bytes"]},
    {"name": "parsedContent", "type": {"type":"array", "items": "string"}},
    {"name": "outlinks", "type": {"type":"map", "values":"string"}},
    {"name": "metadata", "type": {
      "name": "Metadata",
      "type": "record",
      "namespace": "org.apache.gora.examples.generated",
      "fields": [
        {"name": "version", "type": "int"},
        {"name": "data", "type": {"type": "map", "values": "string"}}
      ]
    }}
  ]
}

I have just noticed that in the original issue, NUTCH-1477 [1], the problem
was about a ["null","bytes"] union, so I think we must not stick to solving
only ["null","string"].

In the schema shown here, "metadata" is mandatory, and GORA-174 says nothing
about optional records. Maybe we should fix that too.
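
To make the mandatory/optional distinction concrete, here is a minimal Python
sketch that only inspects the schema JSON. It is not Gora or Avro library
code: the helper name is_optional is hypothetical, and the schema is a
trimmed-down copy of the one above. It treats a field as optional when its
type is a union that contains "null".

```python
import json

# Hypothetical helper (not part of Gora or Avro): a field type is treated as
# optional when it is a union (a JSON list) that contains "null".
def is_optional(field_type):
    return isinstance(field_type, list) and "null" in field_type

# Trimmed-down copy of the WebPage schema, for illustration only.
schema = json.loads("""
{
  "type": "record",
  "name": "WebPage",
  "fields": [
    {"name": "url", "type": "string"},
    {"name": "content", "type": ["null", "bytes"]},
    {"name": "metadata", "type": {
      "type": "record",
      "name": "Metadata",
      "fields": [{"name": "version", "type": "int"}]
    }}
  ]
}
""")

optional = {f["name"]: is_optional(f["type"]) for f in schema["fields"]}
print(optional)  # {'url': False, 'content': True, 'metadata': False}

# An optional record would be declared as a union of "null" and the record.
metadata_field = schema["fields"][2]
metadata_field["type"] = ["null", metadata_field["type"]]
print(is_optional(metadata_field["type"]))  # True
```

Only "content" is optional in the schema as written; wrapping the Metadata
record in a union with "null" is what an optional record would look like.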

One more thing: the ["null","string"] requirement implies that nested
records must handle it too. In the example above, "Metadata : data" should
allow a map of ["null","string"] and, *supposing "Metadata : version" were a
String*, "Metadata : version" should allow the type ["null","string"].
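
As an illustration of what "nested records must handle it too" would mean at
the schema level, the following sketch rewrites every "string" into
["null","string"], descending into records, maps and arrays. The function
name make_strings_nullable is hypothetical and this is not how the Gora
compiler works; it only shows the shape of the resulting schema. Existing
unions are left untouched, since Avro does not allow a union to directly
contain another union.

```python
# Hypothetical helper, for illustration only: returns a copy of an Avro type
# (as parsed JSON) in which every "string" becomes ["null", "string"].
def make_strings_nullable(avro_type):
    if avro_type == "string":
        return ["null", "string"]
    if isinstance(avro_type, list):
        # Already a union: leave it alone to avoid nesting unions.
        return avro_type
    if isinstance(avro_type, dict):
        kind = avro_type.get("type")
        if kind == "record":
            fields = [dict(f, type=make_strings_nullable(f["type"]))
                      for f in avro_type["fields"]]
            return dict(avro_type, fields=fields)
        if kind == "map":
            return dict(avro_type, values=make_strings_nullable(avro_type["values"]))
        if kind == "array":
            return dict(avro_type, items=make_strings_nullable(avro_type["items"]))
    return avro_type  # int, bytes, and anything else stay as they are

# The Metadata record, supposing "version" were a string.
metadata = {
    "type": "record",
    "name": "Metadata",
    "fields": [
        {"name": "version", "type": "string"},
        {"name": "data", "type": {"type": "map", "values": "string"}},
    ],
}

nullable = make_strings_nullable(metadata)
print(nullable["fields"][0]["type"])  # ['null', 'string']
print(nullable["fields"][1]["type"])  # {'type': 'map', 'values': ['null', 'string']}
```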

If this is not desired, we will have to redefine the issue's requirements,
for example as something like: "allow [null,String] on topmost record fields".
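
A narrower requirement like that one could be checked mechanically. The
sketch below is only an assumption about how "topmost" might be interpreted:
a ["null","string"] union counts as topmost when it is directly the type of a
field of the outermost record, and as nested when it sits inside a
sub-record, a map or an array. All names here are hypothetical.

```python
# Hypothetical helpers, for illustration only.
def is_nullable_string(avro_type):
    return (isinstance(avro_type, list) and len(avro_type) == 2
            and "null" in avro_type and "string" in avro_type)

def nullable_string_paths(avro_type, path=""):
    """Yield a dotted path for every ["null","string"] union in the type."""
    if isinstance(avro_type, list):
        if is_nullable_string(avro_type):
            yield path
        return
    if isinstance(avro_type, dict):
        kind = avro_type.get("type")
        if kind == "record":
            for field in avro_type["fields"]:
                child = path + "." + field["name"] if path else field["name"]
                yield from nullable_string_paths(field["type"], child)
        elif kind == "map":
            yield from nullable_string_paths(avro_type["values"], path + "{}")
        elif kind == "array":
            yield from nullable_string_paths(avro_type["items"], path + "[]")

def nested_nullable_strings(record):
    """Paths that would break an 'only on topmost record fields' rule."""
    return [p for p in nullable_string_paths(record)
            if "." in p or p.endswith(("{}", "[]"))]

web_page = {
    "type": "record",
    "name": "WebPage",
    "fields": [
        {"name": "url", "type": ["null", "string"]},
        {"name": "metadata", "type": {
            "type": "record",
            "name": "Metadata",
            "fields": [
                {"name": "version", "type": ["null", "string"]},
                {"name": "data",
                 "type": {"type": "map", "values": ["null", "string"]}},
            ],
        }},
    ],
}

print(list(nullable_string_paths(web_page)))
# ['url', 'metadata.version', 'metadata.data{}']
print(nested_nullable_strings(web_page))
# ['metadata.version', 'metadata.data{}']
```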

Taking ONLY the GORA-174 title (["null","string"]), I will have to make these
changes:

- Modify Nutch's webpage.avsc. "content" will have to be mandatory :(
- Modify the tests, specifically testGetNested(), to check nested
["null","string"] unions. I think the Cassandra module will not pass this
test.

Lewis mentioned creating separate issues for nested and multi-type unions.
That is not my view, but I will go along with the common decision :)


Thanks at least for reading and getting to this line! :)


Alfonso Nishikawa

[0] - https://issues.apache.org/jira/browse/GORA-174
[1] - https://issues.apache.org/jira/browse/NUTCH-1477

"Drinking bloody marys all night will make you feel like a corpse in the
