ignite-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: New SQL execution engine
Date Fri, 27 Sep 2019 13:04:16 GMT
Nikolay, Maxim,

Asking for a list of issues with the current H2-based engine is pointless
because it has a fundamental architectural flaw, not just a bunch of bugs:

Currently, query execution is limited to a two-phase map-reduce task
(with an optional remote cursor when the 'distributed joins' flag is
enabled), so only a limited subset of queries can be executed. You can
easily see this if you try to draw how three non-collocated caches should be
joined on an arbitrary condition.
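
To make the limitation concrete, here is a toy model in Python. It is not
Ignite code: the two-node layout, the modulo partitioning and the table
contents are assumptions made purely for illustration. Each node joins only
its local partitions (map), and the results are concatenated (reduce).

```python
# Toy model of a two-phase map-reduce join. NOT Ignite code: node count,
# partitioning function and data are hypothetical, for illustration only.
NODES = 2

def partition(rows, key):
    """Assign each row to a node by key modulo node count."""
    parts = [[] for _ in range(NODES)]
    for row in rows:
        parts[row[key] % NODES].append(row)
    return parts

def local_join(a_rows, b_rows):
    """Nested-loop join on A.b_id = B.id over the rows given."""
    return [(a["id"], b["id"]) for a in a_rows for b in b_rows
            if a["b_id"] == b["id"]]

A = [{"id": 1, "b_id": 2}, {"id": 2, "b_id": 2}, {"id": 3, "b_id": 1}]
B = [{"id": 1}, {"id": 2}]

# Both tables are partitioned by their own primary key, so they are
# not collocated on the join column A.b_id.
a_parts = partition(A, "id")
b_parts = partition(B, "id")

# Map: every node joins only what it holds. Reduce: concatenate.
map_reduce_result = sorted(
    pair
    for node in range(NODES)
    for pair in local_join(a_parts[node], b_parts[node])
)

# Reference answer: join over the complete data set.
expected = sorted(local_join(A, B))

missing = sorted(set(expected) - set(map_reduce_result))
print("map-reduce:", map_reduce_result)
print("expected:  ", expected)
print("missing:   ", missing)
```

In this sketch the pair (1, 2) is lost because row A.id = 1 and row B.id = 2
live on different nodes. Recovering it requires moving data between nodes
during execution, which a fixed map step followed by a reduce step does not
express once several such joins are chained.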

H2 cannot solve this problem because H2 is a local database and is not
designed to execute distributed queries, let alone the fact that it is not
designed to be embedded in other projects as an execution engine. Because
of this, an H2 upgrade is a huge pain which leads to issues up to and
including broken compilation. This is exactly the reason why the ticket
about index use for the IN() expression [1] has only been fixed in 2.7. One
can see the amount of changes needed for a simple version upgrade.

Now, as for alternatives to Apache Calcite - I personally spent quite a
large amount of time looking for them but did not find any even remotely
matching the abilities and flexibility of Calcite. As folks noted before,
Calcite is specifically designed to have flexible optimization rules and to
support distributed query execution, which is already proven by real-life
projects. If you have any other framework in mind that should be
considered - please let the community know, I believe it will be a more
productive discussion than the current one.

As for the IEP content - I agree, we should have a more detailed
description of steps and technical information there, but I believe this
will be improved further.

--AG

[1] https://issues.apache.org/jira/browse/IGNITE-4150



On Fri, 27 Sep 2019 at 15:33, Maxim Muzafarov <mmuzaf@apache.org> wrote:

> Folks,
>
> I agree with Nikolay, the idea of replacing the H2 engine with the
> most suitable one is reasonable. But since such a change is major, we
> should have strong arguments for it, even for members who are
> working outside the SQL team.
>
> I think it is really necessary to have:
>
> 1. The list of issues related to the current engine (H2) which, from
> different points of view and for different developers, must seem
> unsolvable. For example, `... the H2 execution plan is hard-wired with
> H2 internals and can't be easily transformed` doesn't seem to have
> strong technical arguments behind it.
> After this step, we should have a clear understanding that the engine
> change is required.
>
> 2. Why only Apache Calcite? It seems to me we should have a table
> with a comparison of different engines with the pros and cons of
> each. A brief search shows me that we may have a few options here.
> After this step, we should have a clear understanding of why we choose
> this dependency over another.
>
> 3. We should also have a migration decomposition and step-by-step
> actions to do. I haven't found such a decomposition on the IEP-37 page. Do
> we have one? What will the implementation phases be? What components
> will be changed? What would the new API be, and would there be one? What
> problems are we expecting, e.g. performance degradation in a prototype
> implementation? The `Risks and Assumptions` topic doesn't seem to be
> well described.
> After this step, we should have a clear and obvious implementation
> plan for the new feature.
>
> Let's have a strong technical discussion.
>
> On Fri, 27 Sep 2019 at 15:17, Nikolay Izhikov <nizhikov@apache.org> wrote:
> >
> > Hello, Roman.
> >
> > All I see is links to two tickets:
> >
> > IGNITE-11448 - Open
> > IGNITE-6085 - Closed
> >
> > Other issues are described poorly and have no ticket links.
> > We can't discuss such a huge change as an execution engine replacement
> > with descriptions like:
> >
> > "No data co-location control, i.e. arbitrary data can be returned
> > silently" or
> > "Low control on how query executes internally, as a result we have
> > limited possibility to implement improvements/fixes."
> >
> > I think we need a reproducer that shows the issue.
> > Technical details should also be added.
> >
> > Let's make these descriptions more specific.
> > Let's discuss how we want to fix them with the new engine.
> >
> >
> > On Fri, 27/09/2019 at 15:10 +0300, Roman Kondakov wrote:
> > > Hello Nikolay,
> > >
> > > please see IEP-37 [1]. Issues are there.
> > >
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084
> > >
> > >
>
