drill-dev mailing list archives

From Julian Hyde <julianh...@gmail.com>
Subject Purpose of Scan.selection and Operator.ref?
Date Mon, 28 Jan 2013 23:40:28 GMT
Hello drillers,

I'm still puzzling over the purpose of the "selection" attribute of the "Scan" operator and the
"ref" attribute of various operators such as "Scan", "Transform", "Group".

I notice that "selection" is not used (which is good, since there is no "activity" attribute
in donuts.json).

I understand that "ref" chooses the output expression(s) of each operator, and I see that those expressions
are necessary. But I don't understand why every "ref" in simple_plan.json is prefixed with
"donuts".

My understanding is that each operator's input and output is a JSON array. The elements of
that array (the "rows" in SQL parlance) are usually JSON objects (i.e. records with named
fields) but might sometimes be scalars or arrays.

The output of the "aggregate" operator in simple_plan.json would be something like

[
  {
    "donuts": {
      "sales" : 1099.22,
      "typeCount" : 1,
      "quantity" : 10000,
      "ppu" : 0.11
    }
  },
  {
    "donuts": {
      "sales" : 109.71,
      "typeCount" : 2,
      "quantity" : 159,
      "ppu" : 0.69
    }
  },
  {
    "donuts": {
      "sales" : 184.25,
      "typeCount" : 2,
      "quantity" : 335,
      "ppu" : 0.55
    }
  }
]

The output is a list of objects, each of which has just one field "donuts", whose value is
an object. The only purpose of the "donuts" prefix is to increase the nesting level. And other
operators do the same thing. It would seem to me more natural to just use one level of nesting:

[
  {
    "sales" : 1099.22,
    "typeCount" : 1,
    "quantity" : 10000,
    "ppu" : 0.11
  },
  ...
]
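
To make the difference between the two shapes concrete, here is a small Python sketch. It is not Drill code: `unwrap` is a hypothetical helper, and the rows are copied from the aggregate output shown earlier. It only illustrates that the flat form is the nested form with the single "donuts" wrapper removed from each row.

```python
import json

# Nested form: each row wraps its fields in a single "donuts" object.
nested_rows = [
    {"donuts": {"sales": 1099.22, "typeCount": 1, "quantity": 10000, "ppu": 0.11}},
    {"donuts": {"sales": 109.71, "typeCount": 2, "quantity": 159, "ppu": 0.69}},
    {"donuts": {"sales": 184.25, "typeCount": 2, "quantity": 335, "ppu": 0.55}},
]


def unwrap(rows, prefix):
    """Hypothetical helper: replace each row with the object stored under `prefix`."""
    return [row[prefix] for row in rows]


# Flat form: one level of nesting, fields sit directly on the row.
flat_rows = unwrap(nested_rows, "donuts")
print(json.dumps(flat_rows[0], indent=2))
```

No information is lost in either direction, since the wrapper carries only the fixed name "donuts".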

Of course it's not wrong to do this, but I wanted to ask why someone would choose an extra
level of nesting. Or to check whether my understanding was wrong. (I'm pondering how to make
a SQL front-end generate something like simple_plan.json and right now I can see no reason
why it would generate ref values with a "donuts." prefix.)
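
As an illustration of the question, the sketch below shows how a dotted "ref" path could determine the nesting of an output row. This is an assumption about how "ref" might be interpreted, not a description of Drill's implementation; `set_ref` is a hypothetical helper written for this example.

```python
def set_ref(record, ref, value):
    """Hypothetical: store `value` in `record` at the dotted path `ref`."""
    parts = ref.split(".")
    target = record
    # Walk (and create) intermediate objects for every path segment but the last.
    for name in parts[:-1]:
        target = target.setdefault(name, {})
    target[parts[-1]] = value
    return record


# A ref with the "donuts." prefix produces the extra level of nesting.
with_prefix = {}
set_ref(with_prefix, "donuts.sales", 1099.22)
set_ref(with_prefix, "donuts.ppu", 0.11)

# A ref without the prefix writes the field directly on the row.
without_prefix = {}
set_ref(without_prefix, "sales", 1099.22)
set_ref(without_prefix, "ppu", 0.11)

print(with_prefix)
print(without_prefix)
```

Under this reading, a SQL front-end that emits unprefixed ref values would produce the flat rows directly.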

Is the intent of "selection" to remove a level of nesting when reading a source?

Julian