ignite-dev mailing list archives

From Eduard Shangareev <eduard.shangar...@gmail.com>
Subject Re: IEP-4, Phase 2. Using BL(A)T for in-memory caches.
Date Wed, 23 May 2018 12:34:24 GMT
Igniters,

We have invested too much in explaining BLAT, so it would be hard to change
the name.
I propose that we keep this term.


New names for auto-adjust control.

1. org.apache.ignite.IgniteCluster

*Add*
isBaselineAutoAdjustEnabled()
setBaselineAutoAdjustEnabled(boolean enabled);
setBaselineAutoAdjustTimeout(long timeoutInMs);
setBaselineAutoAdjustMaxTimeout(long timeoutInMs);

2. org.apache.ignite.configuration.IgniteConfiguration

*Add*
IgniteConfiguration setBaselineAutoAdjustEnabled(boolean enabled);
IgniteConfiguration setBaselineAutoAdjustTimeout(long timeoutInMs);
IgniteConfiguration setBaselineAutoAdjustMaxTimeout(long timeoutInMs);
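
For illustration only, here is a minimal sketch of how the two timeouts could
interact. It assumes that setBaselineAutoAdjustTimeout plays the role of the
soft timeout and setBaselineAutoAdjustMaxTimeout the role of the hard timeout
from my earlier message (quoted below). The class BaselineAutoAdjustSketch and
its methods are hypothetical and are not part of the Ignite API; timestamps
are passed in explicitly so the logic is easy to check.

```java
import java.util.OptionalLong;

/** Hypothetical helper (not Ignite API): decides when the baseline should be adjusted. */
final class BaselineAutoAdjustSketch {
    private final long softTimeoutMs; // assumed meaning of setBaselineAutoAdjustTimeout
    private final long hardTimeoutMs; // assumed meaning of setBaselineAutoAdjustMaxTimeout

    private long firstEventTs = -1;   // first join/exit since the last adjustment
    private long lastEventTs = -1;    // most recent join/exit

    BaselineAutoAdjustSketch(long softTimeoutMs, long hardTimeoutMs) {
        if (softTimeoutMs < 0 || hardTimeoutMs < softTimeoutMs)
            throw new IllegalArgumentException("Expected 0 <= soft <= hard");
        this.softTimeoutMs = softTimeoutMs;
        this.hardTimeoutMs = hardTimeoutMs;
    }

    /** Records a node join/exit discovery event observed at nowMs. */
    void onTopologyEvent(long nowMs) {
        if (firstEventTs < 0)
            firstEventTs = nowMs;
        lastEventTs = nowMs;
    }

    /** Returns the time at which the baseline should change, or empty if nothing is pending. */
    OptionalLong adjustDeadline() {
        if (firstEventTs < 0)
            return OptionalLong.empty();
        long softDeadline = lastEventTs + softTimeoutMs;  // restarted by every event
        long hardDeadline = firstEventTs + hardTimeoutMs; // fixed by the first event
        return OptionalLong.of(Math.min(softDeadline, hardDeadline));
    }

    /** True if the baseline should be set to the current topology at nowMs. */
    boolean shouldAdjust(long nowMs) {
        OptionalLong deadline = adjustDeadline();
        return deadline.isPresent() && nowMs >= deadline.getAsLong();
    }

    /** Clears the pending state after the baseline has been changed. */
    void onAdjusted() {
        firstEventTs = -1;
        lastEventTs = -1;
    }
}
```

With a soft timeout of 1000 ms and a hard timeout of 3000 ms, events at 0,
900, 1800 and 2700 ms keep postponing the soft deadline, but the adjustment
still happens no later than 3000 ms.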

Any objections?



On Fri, May 4, 2018 at 10:01 PM, Dmitriy Setrakyan <dsetrakyan@apache.org>
wrote:

> I do not like the name "current" on the methods. I think we should just
> remove it, e.g. currentAffinityTopology() -> affinityTopology()
>
> D.
>
> On Fri, May 4, 2018 at 6:17 AM, Eduard Shangareev <
> eduard.shangareev@gmail.com> wrote:
>
> > Igniters,
> >
> > With Vladimir's help, we analyzed the approaches of other solutions
> > and decided to simplify our affinity topology auto-adjusting.
> >
> > It should be enough to be able to turn auto-adjusting on or off (a flag) and
> > to set 2 timeouts that apply while it is on:
> > - a soft timeout, which would be used if there are no further node joins/exits;
> > - a hard timeout, which we would track from the first discovery event; once it
> > is reached, we would change the affinity topology immediately.
> >
> > All other strategies could be implemented through the API (setAffinityTopology)
> > and metrics tracking by the user's monitoring tools.
> >
> > So, I suggest the following API changes:
> >
> > org.apache.ignite.IgniteCluster
> >
> > *Deprecate*:
> > Collection<BaselineNode> currentBaselineTopology();
> > void setBaselineTopology(Collection<? extends BaselineNode> baselineTop);
> > void setBaselineTopology(long topVer);
> >
> > *Replace them with*
> > Collection<BaselineNode> currentAffinityTopology();
> > void setAffinityTopology(Collection<? extends BaselineNode> affinityTop);
> > void setAffinityTopology(long topVer);
> >
> > *Add*
> > isAffinityTopologyAutoAdjustEnabled()
> > setAffinityTopologyAutoAdjustEnabled(boolean enabled);
> >
> > org.apache.ignite.configuration.IgniteConfiguration
> >
> > *Add*
> > IgniteConfiguration setAffinityTopologyAutoAdjustEnabled(boolean enabled);
> > IgniteConfiguration setAffinityTopologyAutoAdjustTimeout(long timeoutInMs);
> > IgniteConfiguration setAffinityTopologyAutoAdjustMaxTimeout(long timeoutInMs);
> >
> >
> > An open question: could we rename BaselineNode to AffinityNode, or
> > duplicate it as AffinityNode?
> >
> >
> >
> >
> >
> >
> > On Fri, Apr 27, 2018 at 6:56 PM, Ivan Rakov <ivan.glukos@gmail.com> wrote:
> >
> > > Eduard,
> > >
> > > +1 to your proposed API for configuring Affinity Topology change policies.
> > > Obviously we should use "auto" as the default behavior. I believe automatic
> > > rebalancing is expected and more convenient for the majority of users.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > >
> > > On 26.04.2018 19:27, Eduard Shangareev wrote:
> > >
> > >> Igniters,
> > >>
> > >> Ok, I want to share my thoughts about "affinity topology (AT) changing
> > >> policies".
> > >>
> > >>
> > >> There would be three major options:
> > >> -auto;
> > >> -manual;
> > >> -custom.
> > >>
> > >> 1. Automatic change.
> > >> A user could set timeouts to:
> > >> a. change AT on any topology change after some timeout
> > >> (setATChangeTimeout, in seconds);
> > >> b. change AT on node left after some timeout
> > >> (setATChangeOnNodeLeftTimeout, in seconds);
> > >> c. change AT on node join after some timeout
> > >> (setATChangeOnNodeJoinTimeout, in seconds).
> > >>
> > >> b and c are more specific, so they would override a.
> > >>
> > >> Also, I want to introduce a mechanism for merging AT changes, which would
> > >> be turned on by default.
> > >> In other words, when a timeout is reached, we would change AT to the current
> > >> topology, not to the one that existed when the timeout was scheduled.
> > >>
> > >> 2. Manual change.
> > >>
> > >> Current behavior. A user changes AT manually using console tools or the web
> > >> console.
> > >>
> > >> 3. Custom.
> > >>
> > >> We would give the option to plug in a custom implementation of the changing
> > >> policy (class name in config).
> > >> We should pass as incoming parameters:
> > >> - current topology (collection of cluster nodes);
> > >> - current AT (affinity topology);
> > >> - map of GroupId to minimal alive backup number;
> > >> - list of configuration (1.a, 1.b, 1.c);
> > >> - scheduler.
> > >>
> > >> In addition to these configurations, I propose an orthogonal option.
> > >> 4. Emergency affinity topology change.
> > >> It would change AT even if the MANUAL option is set, if the backup factor of
> > >> at least one cache group drops to or below a chosen value (0 by default).
> > >> So, if after a node left only primary partitions (without backups) remained
> > >> for some cache group, we would change AT immediately.
> > >>
> > >>
> > >> Thank you for your attention.
> > >>
> > >>
> > >> On Thu, Apr 26, 2018 at 6:57 PM, Eduard Shangareev <
> > >> eduard.shangareev@gmail.com> wrote:
> > >>
> > >> Dmitriy,
> > >>>
> > >>> I also think that we should think about 2.6 as the target.
> > >>>
> > >>>
> > >>> On Thu, Apr 26, 2018 at 3:27 PM, Alexey Goncharuk <
> > >>> alexey.goncharuk@gmail.com> wrote:
> > >>>
> > >>> Dmitriy,
> > >>>>
> > >>>> I doubt we will be able to fit this in 2.5 given that we did not even agree
> > >>>> on the policy interface. Forcing in-memory caches to use baseline topology
> > >>>> will be an easy technical fix, however, we will need to update and probably
> > >>>> fix lots of failover tests, add new ones.
> > >>>>
> > >>>> I think it makes sense to target this change to 2.6.
> > >>>>
> > >>>> 2018-04-25 22:25 GMT+03:00 Ilya Lantukh <ilantukh@gridgain.com>:
> > >>>>
> > >>>> Eduard,
> > >>>>>
> > >>>>> I'm not sure I understand what you mean by "policy". Is it an interface
> > >>>>> that will have a few default implementations and user will be able to
> > >>>>> create his own one? If so, could you please write an example of such
> > >>>>> interface (how you see it) and how and when its methods will be invoked.
> > >>>>>
> > >>>>
> > >>>>> On Wed, Apr 25, 2018 at 10:10 PM, Eduard Shangareev <
> > >>>>> eduard.shangareev@gmail.com> wrote:
> > >>>>>
> > >>>>> Igniters,
> > >>>>>> I have described the issue with the current approach in the "New definition
> > >>>>>> for affinity node (issues with baseline)" topic [1].
> > >>>>>>
> > >>>>>> Now we have 2 different affinity topologies (one for in-memory, another
> > >>>>>> for persistent caches).
> > >>>>>>
> > >>>>>> It causes problems:
> > >>>>>> - we lose (in general) co-location between different caches;
> > >>>>>> - we can't avoid PME when a non-BLAT node joins the cluster;
> > >>>>>> - implementation should consider 2 different approaches to affinity
> > >>>>>> calculation.
> > >>>>>>
> > >>>>>> So, I suggest unifying behavior of in-memory and persistent caches.
> > >>>>>> They should all use BLAT.
> > >>>>>>
> > >>>>>> Their behaviors were different because we couldn't guarantee the safety
> > >>>>>> of in-memory data.
> > >>>>>> It should be fixed by a new mechanism of BLAT changing policy which was
> > >>>>>> already discussed there - "Triggering rebalancing on timeout or manually
> > >>>>>> if the baseline topology is not reassembled" [2].
> > >>>>>>
> > >>>>>> And we should have a policy by default which is similar to the current one
> > >>>>>> (add nodes, remove nodes automatically but after some reasonable delay
> > >>>>>> [seconds]).
> > >>>>>>
> > >>>>>> After this change, we could stop using the terms 'BLAT', 'Baseline' and so
> > >>>>>> on, because there would not be an alternative. So, there would be only one
> > >>>>>> possible Affinity Topology.
> > >>>>>>
> > >>>>>>
> > >>>>>> [1]
> > >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/New-definition-for-affinity-node-issues-with-baseline-td29868.html
> > >>>>>> [2]
> > >>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html#none
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best regards,
> > >>>>> Ilya
> > >>>>>
> > >>>>>
> > >>>
> > >
> >
>
