calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Julian Hyde <>
Subject Re: [DISCUSSION] Extension of Metadata Query
Date Tue, 04 Jun 2019 17:48:09 GMT
> 1. Why we have 2 methods in RelMetadataProvider?

The metadata system is complicated. We need to allow multiple handlers
for any given call. So, making a metadata call involves multiple
dispatch [1] based on the metadata method being called, the type of
RelNode, and the handlers that are registered. Also it needs to cache
results, and detect cycles. And the dispatch needs to be efficient, so
we generate janino code to do the dispatch, and re-generate when new
handlers or sub-classes of RelNode are added.

I forget details, the two methods are basically required to allow us
to generate code to do the dispatch.

> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.

I disagree. RelMetadataQuery is not an extension point. Its sole
purpose is to to keep state (the cache and cycle-checking).
RelMetadataProvider is the extension point. (By analogy, if you are
un-parsing an AST, you let each AST sub-class handle unparsing itself,
but the unparsed text goes to a simple StringBuilder. RelMetadataQuery
is in the role of the StringBuilder. In a complex system, it is nice
to keep some of the components simple, or at least keep them to
prescribed roles.)



On Sun, Jun 2, 2019 at 11:19 PM Yuzhao Chen <> wrote:
> Currently we provide answer to metadata query through RelMetadataProvider [1], there
are some sub-classes of it:
> RelMetadataProvider
> |
> |- VolcanoRelMetadataProvider
> |- ChainedRelMetadataProvider/DefaultRelMetadataProvider
> |- HepRelMetadataProvider
> |- CachingRelMetadataProvider
> |- ReflectiveRelMetadataProvider
> |- JaninoRelMetadataProvider
> The RelMetadataProvider has two methods: #apply and #handlers, the #apply method seems
a programming interface and there is a demo code how we can use it:
> RelMetadataProvider provider;
> LogicalFilter filter;
> RexNode predicate;
> Function<RelNode, Metadata> function =
> provider.apply(LogicalFilter.class, Selectivity.class};
> Selectivity selectivity = function.apply(filter);
> Double d = selectivity.selectivity(predicate);
> But let's see our RelOptCluster's member variables[2], there are MetadataFactory and
RelMetadataQuery which all can be used to query the metadata, for MetadataFactory, there is
a default impl named MetadataFactoryImpl which will invoke RelMetadataProvider#apply internally,
for RelMetadataQuery, it will invoke RelMetadataProvider#handlers (finally composed and codeden
by JaninoRelMetadataProvider).
> In our planning phrase, we can invoke RelOptRuleCall#getMetadataQuery to get the MQ and
query the metadata.
> For extension of metadata handlers, we can set our customized RelMetadataProvider in
RelOptCluster[3]. But for RelMetadataQuery, we have no way to extend it now, because the RelOptCluster
always has a singleton instance [4] which is only the default implementation.
> My question is as follows:
> 1. Why we have 2 methods in RelMetadataProvider, and why we need the MetadataFactory
and RelMetadataProvider#apply ? It seems that it's function is already been overriden by RelMetadataQuery(The
difference is that MetadataFactory use Reflection and RelMetadataQuery use gened bytes code).
> 2. We should make the RelMetadataQuery in RelOptCluster pluggable.
> [1]
> [2]
> [3]
> [4]
> Best,
> Danny Chan

View raw message