ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergi Vladykin <sergi.vlady...@gmail.com>
Subject Re: Distributive SQL Joins
Date Tue, 11 Aug 2015 09:25:03 GMT
I've created respective Jira issue
https://issues.apache.org/jira/browse/IGNITE-1232

Sergi

2015-08-11 9:31 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:

> On Mon, Aug 10, 2015 at 11:26 PM, Sergi Vladykin <sergi.vladykin@gmail.com
> >
> wrote:
>
> > I was thinking about protecting users from doing stupid things,
> > but ok, we can do a broadcast as well.
> >
>
> I think we should print out a warning in this case for sure. Is there a
> Jira created for this? We should provide a link to this discussion there.
>
>
> >
> > Sergi
> >
> > 2015-08-08 3:46 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> >
> > > Sergi,
> > >
> > > I personally don't like that for certain types of queries we will be
> > > throwing an exception.
> > >
> > > After analyzing the approaches you suggested, I can think of cases
> where
> > A
> > > performs better than B, as well as when B performs better than A.
> > >
> > > However, if you prefer B, I don't mind us taking that approach. As you
> > have
> > > mentioned yourself, in case of non-collocated non-affinity-ID queries,
> > you
> > > would require a broadcast which is a performance hit. I still vote that
> > we
> > > take this performance hit and do the broadcast (optimized with
> batching,
> > of
> > > course), and execute the query instead of throwing an exception.
> > >
> > > D.
> > >
> > > On Fri, Aug 7, 2015 at 3:40 AM, Sergi Vladykin <
> sergi.vladykin@gmail.com
> > >
> > > wrote:
> > >
> > > > Alexey,
> > > >
> > > > 1. Yes, in my plan it should work exactly like that: if both keys in
> > join
> > > > are affinity keys, then we are fully collocated, if only one then we
> > can
> > > > run join remotely as described, if none of them we will fail to run
> the
> > > > query.
> > > >
> > > > 2. I mean we don't have values for these affinity keys in our local
> > query
> > > > result to map requests to remote nodes.
> > > > Example:
> > > > Lest say we have 4 partitioned tables:
> > > > - Organization(id) with affinity key `id`.
> > > > - Person(id, orgId, name) with affinity key `orgId` (it means that it
> > > will
> > > > be collocated with `Organization`)
> > > > - Manufacturer(id) with affinity key `id`.
> > > > - Purchase(id, personId, manufId) with affinity key `manufId`
> > (collocated
> > > > with `Manufacturer`)
> > > >
> > > > As you can see `Purchase` has a reference to a `Person` and we may
> want
> > > to
> > > > join them by this reference in a query like this:
> > > >
> > > > SELECT pe.name FROM Person pe JOIN Purchase pu ON pe.id =
> pu.personId
> > > > WHERE
> > > > pu.id = ?
> > > >
> > > > as you can see neither `pe.id` nor `pu.personId` is an affinity key
> > > here.
> > > > But if the `Person` has affinity key `id` and thus is not collocated
> > with
> > > > `Organization`
> > > > we can run query on `Purchase`, take value of `personId` and find the
> > > > affinity node to get the needed `Person`.
> > > >
> > > > Of course it is a restriction but there are multiple ways to
> workaround
> > > it,
> > > > so I don't think it is really a problem:
> > > >
> > > > 1. Use primary key as affinity key if table is used in such joins.
> This
> > > way
> > > > `Person` still can be joined to `Organization`
> > > > (less effective though) and `Pusrchase` can be joined to `Person` as
> > > well.
> > > > 2. Use denormalization: instread of having `Purchase.personId` store
> > > > `Person` object itself there.
> > > > 3. Introduce another entity which can duplicate data from `Person`
> but
> > > have
> > > > collocation needed for this failing query:
> > > > For our example it can be an entity PersonForPurchace(id, manufId,
> > name)
> > > > with the same affinity key `manufId` as `Purchase`.
> > > > This way our query can be rewritten in fully collocated style:
> > > >
> > > > SELECT pe.name FROM PersonForPurchace pe JOIN Purchase pu ON pe.id =
> > > > pu.personId AND pu.manufId = pe.manufId WHERE pu.id = ?
> > > >
> > > > Sergi
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > 2015-08-07 11:42 GMT+03:00 Alexey Kuznetsov <akuznetsov@gridgain.com
> >:
> > > >
> > > > > Sergi,
> > > > >
> > > > > Questions about plan "B" :)
> > > > > 1) It is possible to throw exception on query prepare state (fail
> > fast)
> > > > > when
> > > > > we don't know remote affinity key?
> > > > > 2) Could you provide an example when we don't know remote affinity
> > > key? I
> > > > > think we always have some default affinity (no?)?
> > > > >
> > > > > --
> > > > > Alexey Kuznetsov
> > > > > GridGain Systems
> > > > > www.gridgain.com
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message