ignite-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: IEP-61 Technical discussion
Date Wed, 25 Nov 2020 13:42:14 GMT
Folks,

I've made some edits to IEP-61 [1] regarding the group membership service
and how the transaction protocol interacts with the replication
infrastructure. Please take a look before our Friday call.

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-61%3A+Common+Replication+Infrastructure

Mon, 23 Nov 2020 at 13:28, Alexey Goncharuk <alexey.goncharuk@gmail.com>:

> Thanks, Ivan,
>
> Another protocol for group membership worth checking out is RAPID [1] (a
> recent one). Not sure though if there are any available implementations for
> it already.
>
> [1] https://www.usenix.org/system/files/conference/atc18/atc18-suresh.pdf
>
> Mon, 23 Nov 2020 at 10:46, Ivan Daschinsky <ivandasch@gmail.com>:
>
>> Also, here is some interesting reading about gossip, SWIM etc.
>>
>> 1 -- http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf
>> 2 -- http://www.antonkharenko.com/2015/09/swim-distributed-group-membership.html
>> 3 -- https://github.com/hashicorp/memberlist (foundation library of
>> HashiCorp Serf)
>> 4 -- https://github.com/scalecube/scalecube-cluster (Java implementation
>> of SWIM)
>>
>> Thu, 19 Nov 2020 at 16:35, Ivan Daschinsky <ivandasch@gmail.com>:
>>
>> > >> Friday, Nov 27th work for you? If ok, let's have an open call then.
>> > Yes, great.
>> > >> As for the protocol port - we will not be dealing with the
>> > >> concurrency...
>> > >> Judging by the Rust port, it seems fairly straightforward.
>> > Yes, they chose to split transport and logic. But the original Go package
>> > from etcd (see raft/node.go) contains a heartbeat mechanism, etc.
>> > I agree with you, this does not seem to be a huge deal to port.
>> >
>> > Thu, 19 Nov 2020 at 16:13, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
>> >
>> >> Ivan,
>> >>
>> >> Agreed, let's have a call to discuss the IEP. I have some more thoughts
>> >> regarding how the replication infrastructure works with
>> >> atomic/transactional caches and will add this info to the IEP. Does next
>> >> Friday, Nov 27th work for you? If ok, let's have an open call then.
>> >>
>> >> As for the protocol port - we will not be dealing with the concurrency
>> >> model if we choose this way, and this is what I like about their code
>> >> structure. Essentially, the raft module is a single-threaded automaton
>> >> which has a callback to process a message and a callback to process a
>> >> tick (timeout), and which produces messages that should be sent and log
>> >> entries that should be persisted. Judging by the Rust port, it seems
>> >> fairly straightforward. I will be happy to discuss this and other
>> >> alternatives on the call as well.
>> >>
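The single-threaded automaton shape described above can be sketched in Java
as follows. This is only an illustration of the structure, not Raft and not
the etcd-raft or raft-rs API: the class ReplicaAutomaton, its tick/step/ready
methods and the message types are hypothetical names chosen for this example.
The caller feeds in messages and ticks, then drains the messages to send and
the entries to persist, so no threads or I/O live inside the module.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; names are illustrative and not taken from any library. */
final class ReplicaAutomaton {
    /** An outgoing message; the caller's transport is responsible for delivery. */
    static final class Message {
        final int to;
        final String type;
        final String payload;

        Message(int to, String type, String payload) {
            this.to = to;
            this.type = type;
            this.payload = payload;
        }
    }

    /** Output drained by the caller: messages to send and entries to persist. */
    static final class Ready {
        final List<Message> messages;
        final List<String> entries;

        Ready(List<Message> messages, List<String> entries) {
            this.messages = messages;
            this.entries = entries;
        }
    }

    private final int heartbeatTicks;
    private final int[] peers;
    private int elapsedTicks;
    private final List<Message> outbox = new ArrayList<>();
    private final List<String> unpersisted = new ArrayList<>();

    ReplicaAutomaton(int heartbeatTicks, int... peers) {
        this.heartbeatTicks = heartbeatTicks;
        this.peers = peers.clone();
    }

    /** Advances the logical clock; the caller decides how long one tick lasts. */
    void tick() {
        if (++elapsedTicks >= heartbeatTicks) {
            elapsedTicks = 0;
            for (int peer : peers)
                outbox.add(new Message(peer, "HEARTBEAT", ""));
        }
    }

    /** Processes one incoming message; only a local proposal is modelled here. */
    void step(String type, String payload) {
        if ("PROPOSE".equals(type)) {
            unpersisted.add(payload);
            for (int peer : peers)
                outbox.add(new Message(peer, "APPEND", payload));
        }
    }

    /** Returns and clears everything the caller must send and persist. */
    Ready ready() {
        Ready r = new Ready(new ArrayList<>(outbox), new ArrayList<>(unpersisted));
        outbox.clear();
        unpersisted.clear();
        return r;
    }
}
```

A driver would call tick() from a timer, step() for each received message,
and ready() afterwards, all from one thread. That is why the host's
concurrency model does not leak into the protocol logic.
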
>> >> Thu, 19 Nov 2020 at 14:41, Ivan Daschinsky <ivandasch@gmail.com>:
>> >>
>> >> > > Any existing library that can be used to avoid re-implementing the
>> >> > > protocol ourselves? Perhaps, porting the existing implementation to
>> >> > > Java
>> >> > Personally, I like this idea. Go libraries (either the raft module of
>> >> > etcd or serf by HashiCorp) are famous for clean code, good design,
>> >> > stability, and modest size.
>> >> > But, on the other hand, Go has a different concurrency model, and
>> >> > porting probably will not be so straightforward.
>> >> >
>> >> >
>> >> >
>> >> > Thu, 19 Nov 2020 at 13:48, Ivan Daschinsky <ivandasch@gmail.com>:
>> >> >
>> >> > > I'd suggest discussing this IEP and its technical details in an open
>> >> > > Zoom meeting.
>> >> > >
>> >> > > Thu, 19 Nov 2020 at 13:47, Ivan Daschinsky <ivandasch@gmail.com>:
>> >> > >
>> >> > >>
>> >> > >>
>> >> > >> ---------- Forwarded message ---------
>> >> > >> From: Ivan Daschinsky <ivandasch@gmail.com>
>> >> > >> Date: Thu, 19 Nov 2020 at 13:02
>> >> > >> Subject: Re: IEP-61 Technical discussion
>> >> > >> To: Alexey Goncharuk <alexey.goncharuk@gmail.com>
>> >> > >>
>> >> > >>
>> >> > >> Alexey, let's raise another question. Specifically, how nodes
>> >> > >> initially find each other (discovery) and how they detect failures.
>> >> > >>
>> >> > >> I suppose that a gossip protocol is an ideal candidate. For example,
>> >> > >> Consul [1] uses this approach, relying on the Serf [2] library to
>> >> > >> discover members of the cluster.
>> >> > >> Then Consul forms a Raft ensemble (server nodes), and clients use
>> >> > >> the Raft ensemble only as a lock service.
>> >> > >>
>> >> > >> PacificA suggests an internal heartbeat mechanism for failure
>> >> > >> detection within a replicated group, but it says nothing about
>> >> > >> initial discovery of nodes.
>> >> > >>
>> >> > >> WDYT?
>> >> > >>
>> >> > >> [1] -- https://www.consul.io/docs/architecture/gossip
>> >> > >> [2] -- https://www.serf.io/
>> >> > >>
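As a point of reference for the heartbeat-based failure detection mentioned
above, here is a minimal timeout-based detector sketched in Java. It is
neither SWIM nor the PacificA mechanism, and it does not address initial
discovery. The class HeartbeatFailureDetector and its methods are
hypothetical names, and the current time is passed in explicitly as an
assumption that keeps the logic deterministic and easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a timeout-based failure detector. */
final class HeartbeatFailureDetector {
    private final long timeoutMillis;
    // TreeMap keeps node ids sorted so that suspects() returns a stable order.
    private final Map<String, Long> lastSeen = new TreeMap<>();

    HeartbeatFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a heartbeat from the given node arrived at the given time. */
    void onHeartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    /** Returns the nodes whose last heartbeat is older than the timeout. */
    List<String> suspects(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis)
                result.add(e.getKey());
        }
        return result;
    }
}
```

A node is only suspected after it has been seen at least once, which
illustrates the gap noted above: a detector of this kind needs a separate
discovery step to learn which nodes exist in the first place.
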
>> >> > >> Thu, 19 Nov 2020 at 12:46, Alexey Goncharuk <alexey.goncharuk@gmail.com>:
>> >> > >>
>> >> > >>> Following up on the Ignite 3.0 scope/development approach threads,
>> >> > >>> this is a separate thread to discuss technical aspects of the IEP.
>> >> > >>>
>> >> > >>> Let's go over the questions raised by Ivan one more time and also
>> >> > >>> see if there are any other thoughts on the IEP:
>> >> > >>>
>> >> > >>>    - *Whether to deploy metastorage on a separate subset of the
>> >> > >>>    nodes or allow Ignite to choose these nodes automatically.* I
>> >> > >>>    think it is feasible to maintain both modes: by default, Ignite
>> >> > >>>    will choose metastorage nodes automatically, which essentially
>> >> > >>>    will provide the same seamless user experience as TCP discovery
>> >> > >>>    SPI - no separate roles, simple deployment. For deployments
>> >> > >>>    where people want to have more fine-grained control over the
>> >> > >>>    nodes' assignments, we will provide a runtime configuration
>> >> > >>>    which will allow pinning the metastorage group to certain nodes,
>> >> > >>>    thus eliminating the latency concerns.
>> >> > >>>    - *Whether there are any TLA+ specs for the PacificA protocol.*
>> >> > >>>    Not to my knowledge, but it is known to be used in production by
>> >> > >>>    Microsoft and other projects, e.g. [1]
>> >> > >>>
>> >> > >>> I would like to collect general feedback on the IEP, as well as
>> >> > >>> feedback on specific parts of it, such as:
>> >> > >>>
>> >> > >>>    - Metastorage API
>> >> > >>>    - Any existing library that can be used to avoid re-implementing
>> >> > >>>    the protocol ourselves? Perhaps porting an existing
>> >> > >>>    implementation to Java, the way TiKV did with etcd-raft [2] [3]?
>> >> > >>>    In my opinion this is a very neat way, because I like the
>> >> > >>>    finite-automaton-like approach of the replication module and,
>> >> > >>>    additionally, we could sync bug fixes and improvements from the
>> >> > >>>    upstream project.
>> >> > >>>
>> >> > >>>
>> >> > >>> Thanks,
>> >> > >>> --AG
>> >> > >>>
>> >> > >>> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
>> >> > >>> [2] https://github.com/etcd-io/etcd/tree/master/raft
>> >> > >>> [3] https://github.com/tikv/raft-rs
>> >> > >>>
>> >> > >>
>> >> > >>
>> >> > >> --
>> >> > >> Sincerely yours, Ivan Daschinskiy
