calcite-dev mailing list archives

From Sarnath K <stell...@gmail.com>
Subject Re: Row Lineage - implementation advice
Date Sun, 19 Feb 2017 05:21:03 GMT
Ah.. Thanks...

On Feb 19, 2017 09:40, "jordan.halterman@gmail.com" <
jordan.halterman@gmail.com> wrote:

> Sure, you can join two tables together and then e.g. add a number from the
> left table to a number from the right table.
>
> SELECT a.foo + b.bar FROM a, b
>
> That outputs a relation with a column whose origin is two separate columns
> in two separate tables. Even without the join, the origin may be two
> columns in one table.
>
> > On Feb 18, 2017, at 7:49 PM, Sarnath K <stellium@gmail.com> wrote:
> >
> > Just curious...How can a column have multiple origins? Join key type
> > scenarios where they have the same value regardless of where they
> > originate from?
> >
> > On Feb 19, 2017 09:18, "jordan.halterman@gmail.com" <
> > jordan.halterman@gmail.com> wrote:
> >
> >> You can often get the origin of a column via
> >> RelMetadataQuery.getColumnOrigin(), but keep in mind columns can have
> >> multiple origins or no origin at all.
> >>
> >>> On Feb 18, 2017, at 5:47 PM, barry squire <bmsquire@gmail.com> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> Calcite's SQL parsing, planning and execution using the enumerators
> >>> module looks like a pretty good fit for an application I want to
> >>> develop. However, I have one requirement that I'd really appreciate
> >>> some guidance on.
> >>>
> >>> Given an SQL query, I'm looking for a way to trace the lineage of a row
> >>> from source tables, through each operator and eventually to the results.
> >>> This is so I can produce something similar to pig illustrate (
> >>> http://research.yahoo.com/files/paper_5.pdf).
> >>>
> >>> I'm still very new to Calcite, but if I understand it correctly, I
> >>> think I could modify the BlockStatement that is generated in each
> >>> Enumerator.implement function to track lineage between input and output
> >>> rows. This doesn't seem wise though, as it would lead to a custom fork
> >>> that I'd need to maintain.
> >>>
> >>> Can anyone provide some insight into the best way to approach this
> >>> problem?
> >>>
> >>> Thanks
> >>
>
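To make the "multiple origins or no origin at all" point from the thread concrete, here is a minimal, self-contained sketch. It does not use Calcite: the class ColumnOriginSketch and its helpers column, literal and plus are hypothetical names invented for illustration, and the "table.column" strings stand in for whatever origin objects a real planner would return. The only idea it demonstrates is that an output column's origins are the union of the origins of the expressions it is built from.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: these types are hypothetical and are not Calcite classes.
final class ColumnOriginSketch {

    /** An expression that can report which base-table columns it derives from. */
    interface Expr {
        Set<String> origins();
    }

    /** Direct reference to a table column, e.g. a.foo: exactly one origin. */
    static Expr column(String table, String name) {
        return () -> Collections.singleton(table + "." + name);
    }

    /** A constant such as 42: derived from no column, so it has no origin. */
    static Expr literal(int value) {
        return Collections::emptySet;
    }

    /** left + right: the result derives from every column either side uses. */
    static Expr plus(Expr left, Expr right) {
        return () -> {
            Set<String> merged = new TreeSet<>(left.origins());
            merged.addAll(right.origins());
            return merged;
        };
    }

    public static void main(String[] args) {
        // SELECT a.foo + b.bar FROM a, b  (two origins in two tables)
        System.out.println(plus(column("a", "foo"), column("b", "bar")).origins());
        // SELECT a.foo + a.baz FROM a  (two origins in one table, no join)
        System.out.println(plus(column("a", "foo"), column("a", "baz")).origins());
        // SELECT 42 FROM a  (no origin at all)
        System.out.println(literal(42).origins());
    }
}
```

Returning a set rather than a single value is the design choice that matters here: a caller has to handle zero, one, or many origins for any output column, which is the caveat raised in the thread.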
