ignite-dev mailing list archives

From Piotr Romański <piotr.roman...@gmail.com>
Subject Re: Continuous queries and duplicates
Date Fri, 11 Jan 2019 14:23:44 GMT
Hi Vladimir, thank you for your response. I tested the current behaviour
and it seems that the order is maintained for notifications within a
partition. Unfortunately, I don’t know how it would behave in exceptional
situations such as partition loss or rebalancing. Do you think it would be
possible to make that ordering guarantee part of the Ignite API? What I
really need is ordering for notifications sharing the same affinity key,
not even for a whole partition. So I think it wouldn’t require any
cross-node ordering.

Thank you,

Piotr
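
A minimal sketch of the kind of ordering check described above, written in
plain Java with no Ignite dependency. It assumes that each notification can
be reduced to a partition number and a partition update counter, and that
the counter grows with every update in that partition (the thread itself
leaves this open). In a real listener the counter would presumably come
from CacheQueryEntryEvent.getPartitionUpdateCounter() and the partition
from the affinity service; the class name PartitionOrderChecker is
hypothetical. Entries sharing an affinity key land in the same partition,
so a per-partition check also covers the per-affinity-key case.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers the last update counter seen per partition. */
final class PartitionOrderChecker {
    private final Map<Integer, Long> lastCounter = new HashMap<>();

    /**
     * Returns true if the notification is newer than everything seen so far
     * for its partition, false if it is a repeat or arrived out of order.
     * Assumption: counters increase with every update within a partition.
     */
    boolean inOrder(int partition, long updateCounter) {
        Long prev = lastCounter.get(partition);
        if (prev != null && updateCounter <= prev) {
            return false; // reordered or duplicated delivery
        }
        lastCounter.put(partition, updateCounter);
        return true;
    }
}
```

Such a check only observes the behaviour of one run; it cannot prove that
the order survives rebalancing or partition loss.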

On Wed, 9 Jan 2019, 21:11, Vladimir Ozerov <vozerov@gridgain.com> wrote:

> Hi,
>
> MVCC caches have the same ordering guarantees as non-MVCC caches, i.e. two
> subsequent updates on a single key will be delivered in proper order. There
> are no guarantees beyond that. The order of updates in two subsequent
> transactions affecting the same partition may be guaranteed with the current
> implementation (though I am not sure), but even if it is so, I am not aware
> that this was ever our design goal. Most likely, this is an implementation
> artifact which may be changed in the future. Cache experts are needed to
> clarify this.
>
> As for MVCC, data anomalies are still possible in the current
> implementation, because we didn't rework initial query handling in the
> first iteration: technically this is not as simple as we thought.
> Once a snapshot is obtained, a query over that snapshot will return a data
> set consistent at some point in time. But the problem is that there is a
> time frame between snapshot acquisition and listener installation (or vice
> versa), which leads to either duplicates or lost entries. Some multi-step
> listener installation will be required here. We haven't designed it yet.
>
> Vladimir.
>
>
>
> On Mon, Dec 24, 2018 at 10:06 PM Denis Magda <dmagda@apache.org> wrote:
>
> > >
> > > In my case, values are immutable - I never change them, I just add new
> > > entry for newer versions. Does it mean that I won't have any duplicates
> > > between the initial query and listener entries when using continuous
> > > queries on caches supporting MVCC?
> >
> >
> > I'm afraid there still might be a race. Val, Vladimir, other Ignite
> > experts, please confirm.
> >
> > > After reading the related thread (
> > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > > )
> > > I'm now concerned about the ordering. My case assumes that there are
> > > groups of entries which belong to a business aggregate object and I would
> > > like to make sure that if I commit two records in two serial transactions
> > > then I have notifications in the same order. Those entries will have
> > > different keys so based on what you said ("we'd better to leave things as
> > > is and guarantee only per-key ordering"), it would seem that the order is
> > > not guaranteed. But do you think it would be possible to guarantee order
> > > when those entries share the same affinity key and they belong to the same
> > > partition?
> >
> >
> > The order should be the same for key-value transactions. Vladimir, could
> > you clarify the MVCC-based behavior?
> >
> > --
> > Denis
> >
> > On Mon, Dec 17, 2018 at 9:55 AM Piotr Romański <piotr.romanski@gmail.com>
> > wrote:
> >
> > > Hi all, sorry for answering so late.
> > >
> > > I would like to use SqlQuery because I can leverage indexes there.
> > >
> > > As it was already mentioned earlier, the partition update counter is
> > > exposed through CacheQueryEntryEvent. Initially, I thought that the
> > > partition update counter is something that's persisted together with
> > > the data, but I'm guessing now that it is only a part of the
> > > notification mechanism.
> > >
> > > I imagined that I would be able to implement my own deduplication by
> > > having 3 stages on the client side:
> > >
> > > 1. Keep processing initial query results and store their keys in memory.
> > >
> > > 2. When the initial query is over, process listener entries, but first
> > > check whether they have already been delivered in the first stage.
> > >
> > > 3. When we are sure that we are already processing notifications for
> > > commits executed after the initial query was done, process listener
> > > entries without any additional checks (so the key set from stage 1 can
> > > be removed from memory).
> > >
> > > The problem is that I have no way to say that I can move from stage 2 to
> > > 3. Another problem is that we need to stash listener entries while still
> > > processing initial query results, causing excessive memory pressure on
> > > our client.
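
Stages 1 and 2 described above can be sketched in plain Java as follows.
This is only an illustration under stated assumptions: the class
KeySetDeduplicator and its method names are hypothetical, nothing here is
Ignite API, and deduplicating by key is only valid when entries are
immutable and every new version gets a new key. Stage 3 is deliberately
missing, because nothing tells the client when the key set can be dropped.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical client-side deduplicator keyed by cache key. */
final class KeySetDeduplicator<K> {
    private final Set<K> initialKeys = new HashSet<>(); // grows with the initial result set
    private final List<K> stashed = new ArrayList<>();  // listener entries held back during stage 1
    private boolean initialQueryDone;

    /** Stage 1: returns true if this initial query row should be processed. */
    boolean onInitialRow(K key) {
        return initialKeys.add(key);
    }

    /**
     * Listener entry. During stage 1 it is stashed and false is returned;
     * during stage 2 it is processed only if the initial query did not deliver it.
     */
    boolean onNotification(K key) {
        if (!initialQueryDone) {
            stashed.add(key);
            return false;
        }
        return !initialKeys.contains(key);
    }

    /** Switches to stage 2 and returns the stashed entries that are not duplicates. */
    List<K> finishInitialQuery() {
        initialQueryDone = true;
        List<K> fresh = new ArrayList<>();
        for (K key : stashed) {
            if (!initialKeys.contains(key)) {
                fresh.add(key);
            }
        }
        stashed.clear();
        return fresh;
    }

    /** Number of keys that must stay in memory; never shrinks in this sketch. */
    int retainedKeys() {
        return initialKeys.size();
    }
}
```

Both memory concerns show up directly: the stash grows while the initial
query is running, and the key set is retained for as long as the query
lives.
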
> > >
> > > In my case, values are immutable - I never change them, I just add new
> > > entry for newer versions. Does it mean that I won't have any duplicates
> > > between the initial query and listener entries when using continuous
> > > queries on caches supporting MVCC?
> > >
> > > After reading the related thread (
> > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > > )
> > > I'm now concerned about the ordering. My case assumes that there are
> > > groups of entries which belong to a business aggregate object and I would
> > > like to make sure that if I commit two records in two serial transactions
> > > then I have notifications in the same order. Those entries will have
> > > different keys so based on what you said ("we'd better to leave things as
> > > is and guarantee only per-key ordering"), it would seem that the order is
> > > not guaranteed. But do you think it would be possible to guarantee order
> > > when those entries share the same affinity key and they belong to the same
> > > partition?
> > >
> > > Piotr
> > >
> > > On Fri, 14 Dec 2018, 19:31, Denis Magda <dmagda@apache.org> wrote:
> > >
> > > > Vladimir,
> > > >
> > > > Thanks for referring to the MVCC and Continuous Queries discussion, I
> > > > knew that I saw us discussing a solution to the duplication problem.
> > > > Let me copy and paste it here for others:
> > > >
> > > > > 2) *Initial query*. We implemented it so that the user can get some
> > > > > initial data snapshot and then start receiving events. Without MVCC we
> > > > > have no guarantees of visibility. E.g. if a key is updated from V1 to
> > > > > V2, it is possible to see V2 in the initial query and in an event. With
> > > > > MVCC it is now technically possible to query data on a certain snapshot
> > > > > and then receive only events that happened after this snapshot, so that
> > > > > we never see V2 twice. Do you think this feature will be interesting
> > > > > for our users?
> > > >
> > > >
> > > > Am I right that this would be a generic solution - whether you use a Scan
> > > > or SQL query as an initial one? Have we planned it for the transactional
> > > > SQL GA or is it out of scope for now?
> > > >
> > > > --
> > > > Denis
> > > >
> > > > On Thu, Dec 13, 2018 at 12:40 PM Vladimir Ozerov <vozerov@gridgain.com>
> > > > wrote:
> > > >
> > > > > [1]
> > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > > > >
> > > > > On Thu, Dec 13, 2018 at 11:38 PM Vladimir Ozerov <vozerov@gridgain.com>
> > > > > wrote:
> > > > >
> > > > > > Denis,
> > > > > >
> > > > > > Not really. They are used to ensure that ordering of notifications is
> > > > > > consistent with ordering of updates, so that when a key K is updated to
> > > > > > V1, then V2, then V3, you never observe V1 -> V3 -> V2. It also solves
> > > > > > the duplicate notification problem in case of node failures, when the
> > > > > > same update is delivered twice.
> > > > > >
> > > > > > However, partition counters are unable to solve the duplicates problem
> > > > > > in general. Essentially, the question is how to get a consistent view of
> > > > > > some data plus all notifications which happened afterwards. There are
> > > > > > only two ways to achieve this - either lock entries during the initial
> > > > > > query, or take a kind of consistent data snapshot. The former was never
> > > > > > implemented in Ignite - our Scan and SQL queries do not use locking. The
> > > > > > latter is achievable in theory with MVCC. I raised that question earlier
> > > > > > [1] (see p.2), and we came to the conclusion that it might be a good
> > > > > > feature for the product. It is not implemented that way for MVCC now,
> > > > > > but most probably it is not extraordinarily difficult to implement.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > [1]
> > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html#a33998
> > > > > >
> > > > > > On Thu, Dec 13, 2018 at 11:17 PM Denis Magda <dmagda@apache.org> wrote:
> > > > > >
> > > > > >> Vladimir,
> > > > > >>
> > > > > > >> The partition counter is supposed to be used internally to solve the
> > > > > > >> duplication issue. Does it sound like the right approach then?
> > > > > > >>
> > > > > > >> What would be an approach for SQL queries? Not sure the partition
> > > > > > >> counter is applicable.
> > > > > >>
> > > > > >> --
> > > > > >> Denis
> > > > > >>
> > > > > > >> On Thu, Dec 13, 2018 at 11:16 AM Vladimir Ozerov <vozerov@gridgain.com>
> > > > > > >> wrote:
> > > > > >>
> > > > > > >> > Partition counter is an internal implementation detail, which has no
> > > > > > >> > sensible meaning to end users. It should not be exposed through the
> > > > > > >> > public API.
> > > > > >> >
> > > > > > >> > On Thu, Dec 13, 2018 at 10:14 PM Denis Magda <dmagda@apache.org> wrote:
> > > > > >> >
> > > > > >> > > Hello Piotr,
> > > > > >> > >
> > > > > > >> > > That's a known problem and I thought a JIRA ticket already exists.
> > > > > > >> > > However, I failed to locate it. The ticket for the improvement should
> > > > > > >> > > be created as a result of this conversation.
> > > > > > >> > >
> > > > > > >> > > Speaking of an initial query type, I would differentiate between
> > > > > > >> > > ScanQueries and SqlQueries. For the former, it sounds reasonable to
> > > > > > >> > > apply the partitionCounter logic. As for the latter, Vladimir Ozerov,
> > > > > > >> > > will it be addressed as part of MVCC/Transactional SQL activities?
> > > > > >> > >
> > > > > >> > > Btw, Piotr what's your initial query type?
> > > > > >> > >
> > > > > >> > > --
> > > > > >> > > Denis
> > > > > >> > >
> > > > > > >> > > On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański <piotr.romanski@gmail.com>
> > > > > > >> > > wrote:
> > > > > >> > >
> > > > > > >> > > > Hi, as suggested by Ilya here:
> > > > > > >> > > > http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html
> > > > > > >> > > > I'm resending it to the developers list.
> > > > > >> > > >
> > > > > > >> > > > From that thread we know that there might be duplicates between initial
> > > > > > >> > > > query results and listener entries received as part of a continuous query.
> > > > > > >> > > > That means that users need to manually dedupe data.
> > > > > > >> > > >
> > > > > > >> > > > In my opinion, manual deduplication in some use cases may lead to memory
> > > > > > >> > > > problems on the client side. In order to remove duplicated notifications
> > > > > > >> > > > which we are receiving in the local listener, we need to keep all initial
> > > > > > >> > > > query results in memory (or at least their unique ids). Unfortunately,
> > > > > > >> > > > there is no way (is there?) to find a point in time when we can be sure
> > > > > > >> > > > that no dups will arrive anymore. That would mean that we need to keep
> > > > > > >> > > > that data indefinitely and use it every time a new notification arrives.
> > > > > > >> > > > In case of multiple continuous queries run from a single JVM, this might
> > > > > > >> > > > eventually become a memory or performance problem. I can see the following
> > > > > > >> > > > possible improvements to Ignite:
> > > > > >> > > >
> > > > > > >> > > > 1. The deduplication between initial query and incoming notifications
> > > > > > >> > > > could be done fully in Ignite. As far as I know, there is already the
> > > > > > >> > > > updateCounter and partition id for all the objects, so it could be used
> > > > > > >> > > > internally.
> > > > > > >> > > >
> > > > > > >> > > > 2. Add a guarantee that notifications arriving in the local listener after
> > > > > > >> > > > the query() method returns are not duplicates. This kind of functionality
> > > > > > >> > > > would require specific synchronization inside Ignite. It would also mean
> > > > > > >> > > > that the query() method cannot return before all potential duplicates are
> > > > > > >> > > > processed by a local listener, which looks wrong.
> > > > > > >> > > >
> > > > > > >> > > > 3. Notify users that starting from a given notification they can be sure
> > > > > > >> > > > they will not receive any duplicates anymore. This could be an additional
> > > > > > >> > > > boolean flag in the CacheQueryEntryEvent.
> > > > > > >> > > >
> > > > > > >> > > > 4. CacheQueryEntryEvent already exposes the partitionUpdateCounter.
> > > > > > >> > > > Unfortunately, we don't have this information for initial query results.
> > > > > > >> > > > If we had, a client could manually deduplicate notifications and get rid
> > > > > > >> > > > of initial query results for a given partition after newer notifications
> > > > > > >> > > > arrive. Also, it would be very convenient to expose the partition id as
> > > > > > >> > > > well, but for now we can figure it out using the affinity service. The
> > > > > > >> > > > assumption here is that notifications are ordered by
> > > > > > >> > > > partitionUpdateCounter (is it true?).
> > > > > >> > > >
> > > > > >> > > > Please correct me if I'm missing anything.
> > > > > >> > > >
> > > > > >> > > > What do you think?
> > > > > >> > > >
> > > > > >> > > > Piotr
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
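
Proposal 4 from the original message could look roughly like the sketch
below. It is hypothetical on several counts: initial query results do not
carry a partition update counter today (that is exactly what the proposal
asks for), the class CounterDeduplicator is made up for illustration, and
the cleanup step relies on the unconfirmed assumption that notifications
within a partition arrive ordered by counter. It also assumes that
notifications are handled only after the initial query has been fully
consumed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical deduplicator based on (partition, update counter) pairs. */
final class CounterDeduplicator {
    /** Counters delivered by the initial query, grouped by partition. */
    private final Map<Integer, TreeSet<Long>> seen = new HashMap<>();

    /** Records the (hypothetical) update counter attached to an initial query row. */
    void onInitialRow(int partition, long counter) {
        seen.computeIfAbsent(partition, p -> new TreeSet<>()).add(counter);
    }

    /** Returns true if the notification should be processed, false if it is a duplicate. */
    boolean onNotification(int partition, long counter) {
        TreeSet<Long> counters = seen.get(partition);
        if (counters == null) {
            return true; // nothing from the initial query is tracked for this partition
        }
        if (counters.contains(counter)) {
            return false; // the initial query already delivered this update
        }
        if (counter > counters.last()) {
            // Assuming ordered delivery, no older update can follow,
            // so the per-partition state can be released.
            seen.remove(partition);
        }
        return true;
    }

    /** Number of partitions that still hold initial query state. */
    int trackedPartitions() {
        return seen.size();
    }
}
```

State for a partition is released as soon as a newer notification arrives
for it; a partition that never receives another update keeps its counters
in memory, so this bounds the problem rather than removing it.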
