drill-dev mailing list archives

From Julian Hyde <julianh...@gmail.com>
Subject Re: logical plan design coming together
Date Fri, 12 Oct 2012 18:32:09 GMT
For those implementing parsing & validation of the query language, please let me share
my hard-earned wisdom...

1. Separate parsing and validation. The parser should do the absolute minimum of validation.
Don't try to validate identifiers. Don't do any type-checking. This makes error messages better
('This function needs a boolean parameter' versus 'Expecting "true" or "false" or "<token>
and" or 101 other possibilities'). It also lets the parser stay focused on one task, which
is difficult enough: converting text into a parse tree.
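
To illustrate point 1, here is a minimal sketch in Python, assuming a toy expression
language with integer and boolean literals, identifiers, and function calls. The names
tokenize, parse, and validate and the tuple-based tree shape are hypothetical and are not
taken from the Drill design. The parser reports only syntax problems; the type error
comes from a separate validation pass, which can phrase it in semantic terms.

```python
import re

# Hypothetical tree shape: ("lit", value), ("id", name) or ("call", name, [args]).
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[(),])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"Unexpected character near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    """Syntax only: no identifier lookup and no type checking."""
    tree, rest = _expr(tokenize(text))
    if rest:
        raise SyntaxError(f"Unexpected token '{rest[0]}'")
    return tree


def _expr(tokens):
    if not tokens:
        raise SyntaxError("Unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head.isdigit():
        return ("lit", int(head)), rest
    if head in ("true", "false"):
        return ("lit", head == "true"), rest
    if head in ("(", ")", ","):
        raise SyntaxError(f"Unexpected token '{head}'")
    if rest and rest[0] == "(":
        args, rest = [], rest[1:]
        while rest and rest[0] != ")":
            arg, rest = _expr(rest)
            args.append(arg)
            if rest and rest[0] == ",":
                rest = rest[1:]
            elif rest and rest[0] != ")":
                raise SyntaxError(f"Unexpected token '{rest[0]}'")
        if not rest:
            raise SyntaxError("Missing ')'")
        return ("call", head, args), rest[1:]
    return ("id", head), rest


def validate(tree):
    """Separate pass: returns a type name or raises TypeError."""
    kind = tree[0]
    if kind == "lit":
        return "boolean" if isinstance(tree[1], bool) else "integer"
    if kind == "call" and tree[1] == "not":
        if len(tree[2]) != 1 or validate(tree[2][0]) != "boolean":
            raise TypeError("Function 'not' needs a boolean parameter")
        return "boolean"
    raise TypeError(f"Cannot resolve '{tree[1]}'")
```

In this sketch parse("not(1)") succeeds, because the text is syntactically fine; only
validate() rejects it, and it can say why in terms of the function and its parameter.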

2. During the validation phase, do not modify the parse tree. If you need to annotate each
node with a type, put it into a map from parse tree node -> type, not into a field in each
node. Put any state you need (e.g. scope for resolving identifiers) into a temporary state
that exists only during validation (think of the visitor pattern). And definitely do not do
any tree-surgery. If you need to rewrite the tree, do it post validation. (In the planner,
or just before planning, is a good time.) See http://en.wikipedia.org/wiki/Immutable_object.
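
Point 2 could look roughly like the following Python sketch. The node classes, the
Validator class, and the "not" function are hypothetical and exist only for illustration.
The tree nodes are immutable, inferred types go into a side table keyed by node, and the
identifier scope lives in a validator object that is thrown away after validation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


# Hypothetical parse tree nodes. frozen=True makes them immutable;
# eq=False keeps identity-based hashing, so every node is a distinct map key
# even when two nodes have the same contents.
@dataclass(frozen=True, eq=False)
class Literal:
    value: object


@dataclass(frozen=True, eq=False)
class Identifier:
    name: str


@dataclass(frozen=True, eq=False)
class Call:
    func: str
    args: Tuple["Node", ...]


Node = Union[Literal, Identifier, Call]


class ValidationError(Exception):
    pass


class Validator:
    """Holds all validation state; discard it once validation is done."""

    def __init__(self, scope: Dict[str, str]):
        self.scope = scope                 # identifier name -> type name
        self.types: Dict[Node, str] = {}   # side table: node -> type name

    def visit(self, node: Node) -> str:
        if isinstance(node, Literal):
            t = "boolean" if isinstance(node.value, bool) else "integer"
        elif isinstance(node, Identifier):
            if node.name not in self.scope:
                raise ValidationError(f"Unknown identifier '{node.name}'")
            t = self.scope[node.name]
        elif isinstance(node, Call) and node.func == "not":
            if len(node.args) != 1 or self.visit(node.args[0]) != "boolean":
                raise ValidationError("Function 'not' needs a boolean parameter")
            t = "boolean"
        else:
            raise ValidationError(f"Cannot validate {node!r}")
        self.types[node] = t   # annotate via the map, never via the node
        return t
```

A later phase (the planner, say) can read validator.types to look up the type of any
node, while the tree it receives is exactly the tree the parser produced.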


On Oct 12, 2012, at 10:34 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Great comments.
> One particular high-level comment that Julian made is a criticism that I
> have made in the past of other projects.  It is probably good for my
> character to be on the receiving side of this criticism for once.
> The question is why should we use/invent a new concrete syntax when JSON
> would do just as well (I am dropping the XML part of the suggestion due to
> known prejudices on this list).
> I don't have a good answer to this question.  It makes certain problems
> quite a bit easier.  Moreover, I have said in the past that it is nuts to
> re-invent concrete syntax for config files and extension languages like
> this.
> My course going forward is that I think I will put down both syntaxes and
> let folks form their own opinion.  Using JSON will definitely move things
> ahead more quickly since other folks have done the parser for us.
> On Fri, Oct 12, 2012 at 12:05 AM, Julian Hyde <julianhyde@gmail.com> wrote:
>> Ted,
>> Great start. I've made some comments on the doc.
>> Julian
>> On Oct 11, 2012, at 10:48 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>>> The design for the logical plan is coming together.  Anybody should be able
>>> to get to the interim design document at
>>> https://docs.google.com/document/d/1QTL8warUYS2KjldQrGUse7zp8eA72VKtLOHwfXy6c7I/edit
>>> You should also be able to see the discussion so far.  Many thanks to
>>> Timothy Chen for kibitzing very well as I wrote.  His astute observations
>>> and questions were critical.
>>> I have to go sleep now, but it would be great to see progress on this while
>>> I sleep.  Remember that comments and questions are as valuable as (or more
>>> valuable than) text.  Remember also, this document has a complete history so
>>> we can reconstruct it no matter what happens.
>>> I would particularly like eyes on this (if practical) from Camuel, Jason,
>>> Gera and Julian Hyde.  They have had some very good thoughts about this
>>> layer in the past and probably will spot several errors in what I have
>>> written.
>>> The plan for this document as it stabilizes is to put it into the web-site
>>> under the documentation area.  We will probably want to do that before it
>>> really is done to make sure that people can find it easily and to ensure a
>>> checkpoint is in Apache-land.
>>> See y'all tomorrow.
