calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Using Calcite's parser for similar query analysis?
Date Fri, 04 Dec 2015 22:50:55 GMT
Yes, I think you idea of an equivalence set is what I'm looking for.

On Mon, Nov 30, 2015 at 11:22 AM, Julian Hyde <jhyde@apache.org> wrote:

> I think there are divergent definitions of “similar”, and it helps if you
> state why you want to detect similar queries.
>
> For example, by one definition, queries are similar if they are the same
> modulo substituting constants. For these queries,  you can share plans if
> you hoist literals[1].
>
> Other queries are similar if they use the same underlying tables. (The
> goal here might be security or compliance.)
>
> Another definition of similarity is “can I satisfy query A using the
> results of query B”? Materialized view rewrite is interested in this.
>
> For some kinds of similarity, I would work on the AST (i.e. SqlNode). For
> other, deeper, kinds of similarity I would recommend working at the algebra
> level (i.e. RelNode).
>
> Then there is equivalence sets vs. similarity. I think you’re talking
> about equivalence sets. Q1 and Q2 are either in the same equivalence set or
> they are not. If they are, then f(Q1) = f(Q2), where f is some function,
> say a hash function or canonization function. Other kinds of similarity
> (e.g. [2]) might give a “distance” between queries, and queries are deemed
> equivalent iff that distance is 0.
>
> Julian
>
> [1] https://issues.apache.org/jira/browse/CALCITE-963
>
> [2] https://en.wikipedia.org/wiki/Cosine_similarity <
> https://en.wikipedia.org/wiki/Cosine_similarity>
>
>
> > On Nov 30, 2015, at 11:08 AM, Nick Dimiduk <ndimiduk@apache.org> wrote:
> >
> > Hello.
> >
> > I have a pile of SQL queries. Is it possible to use Calcite to group
> these
> > queries into "similar" buckets? Perhaps via some hash function on a
> parsed
> > query? I see some databases' query planners provide this kind of
> > functionality, but mine apparently does not.
> >
> > Thanks,
> > Nick
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message