ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergi Vladykin <sergi.vlady...@gmail.com>
Subject Re: nested SQL sub-queries with non-collocated joins
Date Thu, 01 Jun 2017 19:32:09 GMT
I guess it must work the following way:

If distributed joins are enabled we can try to prove that the subquery is
collocated, if we can't then try to rewrite it, if we can't, then throw an
exception.

Still this can not be done 100% correct, probably we have to have some flag
which allows to disable this subquery rewriting.

Sergi

2017-06-01 21:33 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:

> Sergi,
>
> I am OK with any improvement here, but we need to be able to clearly state
> to a user what is supported and what is not. If we cannot clearly describe
> it, I would rather not support it at all and throw an exception.
>
> Is this going to be possible with your solution?
>
> D.
>
> On Thu, Jun 1, 2017 at 2:51 AM, Sergi Vladykin <sergi.vladykin@gmail.com>
> wrote:
>
> > The approach you are suggesting will be very complex for current
> > implementation. Also most probably very inefficient.
> >
> > Actually I was thinking about another but similar approach: in many cases
> > we can rewrite a subquery in WHERE clause into JOIN subquery.
> >
> > Like the following:
> >
> > SELECT x.* FROM x WHERE x.a = (SELECT MAX(y.a) FROM y WHERE y.b = x.b)
> >
> >  ===>
> >
> > SELECT x.* FROM x, (SELECT MAX(y.a), y.b FROM y GROUP BY y.b) z WHERE
> x.b =
> > z.b
> >
> > There are still problems here:
> >
> > 1. We will not be able to rewrite all the queries.
> > 2. We should not rewrite queries like this by default because this will
> > have a noticeable performance penalty for correctly collocated
> subqueries.
> > Probably we will need some flag for that.
> >
> > Sergi
> >
> > 2017-05-31 21:26 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> >
> > > Igniters (specifically Sergi),
> > >
> > > It has come to my attention today that nested sub-select statements,
> when
> > > used in combination with non-collocated joins do not work properly in
> > > Ignite.
> > >
> > > So a query like this, where A, B, and C are all stored in Partitioned
> > > caches and are **not** collocated at all, will not work.
> > >
> > >
> > > > *select * from A, B where a.id <http://a.id> = b.a_id and
> b.somefield
> > in
> > > > (select somefield from C where c.zipcode = ?)*
> > >
> > >
> > > The main reason it is not supported right now is because, in the
> absence
> > of
> > > collocation, such query may create N^N complexity and it was decided
> that
> > > it is best not supporting it at all.
> > >
> > > However, I am not sure why N^N complexity is required. Why not support
> it
> > > as follows?
> > >
> > >    1. execute the nested subquery and store the result in a temporary
> > >    Replicated table.
> > >    2. execute the original query and use the temporary Replicated table
> > >    instead of the sub-query.
> > >
> > > Sergi, given that you are the author of the code, can you provide some
> > > insight here?
> > >
> > > Thanks,
> > > D.
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message