ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: nested SQL sub-queries with non-collocated joins
Date Thu, 01 Jun 2017 18:33:58 GMT
Sergi,

I am OK with any improvement here, but we need to be able to clearly state
to a user what is supported and what is not. If we cannot clearly describe
it, I would rather not support it at all and throw an exception.

Is this going to be possible with your solution?

D.

On Thu, Jun 1, 2017 at 2:51 AM, Sergi Vladykin <sergi.vladykin@gmail.com>
wrote:

> The approach you are suggesting will be very complex for current
> implementation. Also most probably very inefficient.
>
> Actually I was thinking about another but similar approach: in many cases
> we can rewrite a subquery in WHERE clause into JOIN subquery.
>
> Like the following:
>
> SELECT x.* FROM x WHERE x.a = (SELECT MAX(y.a) FROM y WHERE y.b = x.b)
>
>  ===>
>
> SELECT x.* FROM x, (SELECT MAX(y.a), y.b FROM y GROUP BY y.b) z WHERE x.b =
> z.b
>
> There are still problems here:
>
> 1. We will not be able to rewrite all the queries.
> 2. We should not rewrite queries like this by default because this will
> have a noticeable performance penalty for correctly collocated subqueries.
> Probably we will need some flag for that.
>
> Sergi
>
> 2017-05-31 21:26 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>
> > Igniters (specifically Sergi),
> >
> > It has come to my attention today that nested sub-select statements, when
> > used in combination with non-collocated joins do not work properly in
> > Ignite.
> >
> > So a query like this, where A, B, and C are all stored in Partitioned
> > caches and are **not** collocated at all, will not work.
> >
> >
> > > *select * from A, B where a.id <http://a.id> = b.a_id and b.somefield
> in
> > > (select somefield from C where c.zipcode = ?)*
> >
> >
> > The main reason it is not supported right now is because, in the absence
> of
> > collocation, such query may create N^N complexity and it was decided that
> > it is best not supporting it at all.
> >
> > However, I am not sure why N^N complexity is required. Why not support it
> > as follows?
> >
> >    1. execute the nested subquery and store the result in a temporary
> >    Replicated table.
> >    2. execute the original query and use the temporary Replicated table
> >    instead of the sub-query.
> >
> > Sergi, given that you are the author of the code, can you provide some
> > insight here?
> >
> > Thanks,
> > D.
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message