ignite-dev mailing list archives

From Denis Mekhanikov <dmekhani...@gmail.com>
Subject Re: Service grid redesign
Date Fri, 06 Apr 2018 15:28:51 GMT
Val,

I don't really like the idea of automatically redeploying services when
classes change.
Different nodes may detect these changes at different moments, so there is
no guarantee that all nodes run the same version.
If redeployment fails, there is no way to notify user code about it.
Also, service fields may change between versions, so already deployed
services may fail to deserialize with the new classes.
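
To make the version skew concrete, here is a toy model, not Ignite code. It
assumes every node polls the class storage on its own fixed schedule; the
class name, the tick-based clock, and the polling scheme are all hypothetical
and only illustrate the argument.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical): every node polls the class storage on its own schedule. */
final class VersionSkewSketch {
    /**
     * Version a node runs at time {@code now}. The node polls at offset, offset + period, ...
     * Version 2 is published at updateTime and is picked up at the first poll at or after it.
     */
    static int observedVersion(int now, int offset, int period, int updateTime) {
        if (now < offset)
            return 1; // the node has not polled yet, so it still runs version 1

        int lastPoll = offset + ((now - offset) / period) * period;

        return lastPoll >= updateTime ? 2 : 1;
    }

    /** Versions seen across the cluster at one moment, one entry per node offset. */
    static List<Integer> clusterView(int now, int[] offsets, int period, int updateTime) {
        List<Integer> versions = new ArrayList<>();

        for (int offset : offsets)
            versions.add(observedVersion(now, offset, period, updateTime));

        return versions;
    }
}
```

With a period of 5 ticks and an update published at tick 10, two nodes that
started polling at ticks 0 and 3 run different versions at tick 12 and only
agree again at tick 13.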

I think it would be better if the user could trigger redeployment manually.
That would solve the problems above and would also let the user redeploy
services when only their field values change and the implementation stays
the same.
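
A minimal sketch of what a manually triggered redeployment could look like,
under stated assumptions: this is a self-contained model, not an Ignite API.
ManualRedeploySketch, Result, and the canStart callback (a stand-in for
Service.init() succeeding on a node) are hypothetical names. The point is
that an explicit call gives the caller a result to inspect, and that the
switch to the new version can be all-or-nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical model of user-triggered redeployment with an explicit result. */
final class ManualRedeploySketch {
    /** Outcome handed back to the caller, so user code learns about failures. */
    static final class Result {
        final boolean success;
        final List<String> failedNodes;

        Result(boolean success, List<String> failedNodes) {
            this.success = success;
            this.failedNodes = failedNodes;
        }
    }

    /** Node id -> version of the service currently running there. */
    private final Map<String, Integer> deployed = new HashMap<>();

    ManualRedeploySketch(List<String> nodes, int initialVersion) {
        for (String node : nodes)
            deployed.put(node, initialVersion);
    }

    /**
     * Tries to move every node to newVersion. canStart stands in for the
     * service initialising successfully on a given node.
     */
    Result redeploy(int newVersion, BiPredicate<String, Integer> canStart) {
        List<String> failed = new ArrayList<>();

        for (String node : deployed.keySet())
            if (!canStart.test(node, newVersion))
                failed.add(node);

        if (!failed.isEmpty())
            return new Result(false, failed); // keep the old version everywhere

        deployed.replaceAll((node, oldVersion) -> newVersion);

        return new Result(true, failed);
    }

    int versionOn(String node) {
        return deployed.get(node);
    }
}
```

If any node cannot start the new version, the caller gets the list of failed
nodes and the cluster stays on the old version; otherwise all nodes switch
together.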

What do you think?

Denis

On Thu, Apr 5, 2018 at 22:37, Denis Magda <dmagda@apache.org> wrote:

> Val,
>
> Sounds like a great solution. I'm totally for it.
>
> --
> Denis
>
> On Thu, Apr 5, 2018 at 12:32 PM, Valentin Kulichenko <valentin.kulichenko@gmail.com> wrote:
>
> > Denis,
> >
> > This is why I'm suggesting to use DeploymentSpi for this. The way I see
> > it, instead of deploying classes on the local classpath, the user can
> > deploy them in the storage that the SPI points to. If a class is updated
> > in the storage, Ignite detects this and automatically restarts the
> > service. This is a very simple and straightforward approach that doesn't
> > require a lot of changes on our side and allows us to reuse the existing
> > implementation of DeploymentSpi.
> >
> > -Val
> >
> > On Thu, Apr 5, 2018 at 12:13 PM, Denis Magda <dmagda@gridgain.com> wrote:
> >
> > > >
> > > > There is no need to deserialize services on the coordinator. It should
> > > > only be able to calculate the assignments.
> > > > *LazyServiceConfiguration* should be used to deliver the service
> > > > configurations, just like it is done right now.
> > >
> > >
> > > Can that configuration be tweaked over time, requiring the class to be
> > > updated on all the nodes (if, for instance, someone wants to deploy the
> > > next version of a service)? Just want to be sure we don't need to restart
> > > the cluster nodes (that won't be used for service deployments) on
> > > service-related configuration changes.
> > >
> > > --
> > > Denis
> > >
> > > On Thu, Apr 5, 2018 at 8:18 AM, Denis Mekhanikov <dmekhanikov@gmail.com> wrote:
> > >
> > > > Denis,
> > > > There is no need to deserialize services on the coordinator. It should
> > > > only be able to calculate the assignments.
> > > > *LazyServiceConfiguration* should be used to deliver the service
> > > > configurations, just like it is done right now.
> > > >
> > > > Val,
> > > > Using DeploymentSpi is a good idea; I didn't think about this
> > > > possibility.
> > > > This is a viable alternative to peer class loading, though not as
> > > > user-friendly.
> > > > But if peer class loading is that hard to implement, then I vote for
> > > > DeploymentSpi.
> > > > As far as I understand, it won't require any additional changes in
> > > > Ignite, but will make users think about using a proper DeploymentSpi.
> > > > Please correct me if I'm wrong.
> > > > It would be good, though, to add some examples of service redeployment
> > > > when the implementation class changes.
> > > >
> > > > Denis
> > > >
> > > > On Thu, Apr 5, 2018 at 2:33, Valentin Kulichenko <valentin.kulichenko@gmail.com> wrote:
> > > >
> > > > > I don't think peer class loading is even possible for services. I
> > > > > believe we should reuse DeploymentSpi [1] for versioning.
> > > > >
> > > > > [1] https://apacheignite.readme.io/docs/deployment-spi
> > > > >
> > > > > -Val
> > > > >
> > > > > On Wed, Apr 4, 2018 at 12:52 PM, Denis Magda <dmagda@gridgain.com> wrote:
> > > > >
> > > > > > Sorry, that was me who renamed the IEP to "Oil Change in Service Grid". I
> > > > > > was writing this email after the renaming. I like that title more because
> > > > > > it's fun and highlights what we intend to do: cleaning our service grid
> > > > > > engine and powering it up with new "liquid" (a new communication and
> > > > > > deployment approach not available before).
> > > > > >
> > > > > > Denis
> > > > > >
> > > > > >
> > > > > > > This message contains the serialized service instance and its
> > > > > > > configuration.
> > > > > > > It is delivered first to the coordinator node, which calculates the
> > > > > > > service deployment assignments and adds this information to the message.
> > > > > >
> > > > > >
> > > > > > I would consider using a NodeFilter first to decide where a service can
> > > > > > potentially be deployed. Otherwise, we would require service classes to be
> > > > > > on every node (every node might become a coordinator), which is not a
> > > > > > desirable requirement.
> > > > > >
> > > > > >
> > > > > > As for peer class loading, I would back Dmitriy up here. Let's at least
> > > > > > not focus on this task for now. We should design service versioning the
> > > > > > right way first and support it.
> > > > > >
> > > > > > --
> > > > > > Denis
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 4, 2018 at 12:20 PM, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> > > > > >
> > > > > > > Here is the correct link:
> > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid
> > > > > > >
> > > > > > > I have looked at the tickets there, and I believe that we should not
> > > > > > > support peer deployment for services. It is very hard, and I do not think
> > > > > > > we should even try.
> > > > > > >
> > > > > > > I am proposing closing this ticket as Won't Fix -
> > > > > > > https://issues.apache.org/jira/browse/IGNITE-975
> > > > > > >
> > > > > > > D.
> > > > > > >
> > > > > > > On Wed, Apr 4, 2018 at 5:39 AM, Denis Mekhanikov <dmekhanikov@gmail.com> wrote:
> > > > > > >
> > > > > > > > Vyacheslav,
> > > > > > > >
> > > > > > > > I've just posted my first draft of the IEP:
> > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Service+grid+improvements
> > > > > > > > It's not finished yet, but you can get the idea from it.
> > > > > > > > If you have any thoughts, please let me know and I'll add them to the IEP.
> > > > > > > >
> > > > > > > > Denis
> > > > > > > >
> > > > > > > > On Wed, Apr 4, 2018 at 13:09, Vyacheslav Daradur <daradurvs@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Denis, thanks for the link.
> > > > > > > > >
> > > > > > > > > I looked through the task, and I think I understand the point of your
> > > > > > > > > redesign now.
> > > > > > > > >
> > > > > > > > > Do you have a clear plan or an IEP for the whole redesign?
> > > > > > > > >
> > > > > > > > > I'm interested in this component and I'd like to take part in its
> > > > > > > > > development.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Apr 2, 2018 at 2:55 PM, Denis Mekhanikov <dmekhanikov@gmail.com> wrote:
> > > > > > > > > > Vyacheslav,
> > > > > > > > > >
> > > > > > > > > > The service deployment design based on the replicated utility cache has
> > > > > > > > > > proven to be unstable and deadlock-prone.
> > > > > > > > > > You can find a list of related JIRA issues in my previous letter.
> > > > > > > > > >
> > > > > > > > > > The intention behind it is similar to the binary metadata redesign that
> > > > > > > > > > happened in the following ticket: IGNITE-4157
> > > > > > > > > > <https://issues.apache.org/jira/browse/IGNITE-4157>
> > > > > > > > > > This change in the service deployment procedure will eliminate the need
> > > > > > > > > > for another internal replicated cache and make service deployment more
> > > > > > > > > > reliable on an unstable topology.
> > > > > > > > > >
> > > > > > > > > > Denis
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 27, 2018 at 23:21, Vyacheslav Daradur <daradurvs@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Hi, Denis Mekhanikov!
> > > > > > > > > >>
> > > > > > > > > > >> As far as I know, Ignite services are based on IgniteCache and we have
> > > > > > > > > > >> all its features. We can use listeners or continuous queries for
> > > > > > > > > > >> deployment synchronization.
> > > > > > > > > > >>
> > > > > > > > > > >> Why do you want to use the discovery layer for that?
> > > > > > > > > > >>
> > > > > > > > > > >> One more thing: we can use the baseline approach for services. That
> > > > > > > > > > >> means *IgniteServices.deploy()* returns a ready-to-work service after
> > > > > > > > > > >> deployment on baseline nodes and deploys to other nodes on demand, for
> > > > > > > > > > >> example when the load on the deployed service becomes high.
> > > > > > > > > > >>
> > > > > > > > > > >> About versioning: maybe it makes sense to extend the public API:
> > > > > > > > > > >> IgniteServices.service(name, *version*)?
> > > > > > > > > > >>
> > > > > > > > > > >> At first deployment, we can compute the service's hashcode (just as an
> > > > > > > > > > >> example) and store it. After a new deployment request for a service with
> > > > > > > > > > >> an existing name, we compute the new service's hashcode and compare the
> > > > > > > > > > >> two. If the hashcodes differ, we deploy the new service as a service
> > > > > > > > > > >> with a different version.
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > > >> On Fri, Mar 23, 2018 at 10:03 PM, Denis Magda <dmagda@apache.org> wrote:
> > > > > > > > > >> > Denis,
> > > > > > > > > >> >
> > > > > > > > > > >> > Thanks for the extensive analysis. There is vast room for optimizations
> > > > > > > > > > >> > on the service grid side.
> > > > > > > > > > >> >
> > > > > > > > > > >> > Yakov, Sam, Alex G.,
> > > > > > > > > > >> >
> > > > > > > > > > >> > How do you like the idea of using the discovery protocol for exchanging
> > > > > > > > > > >> > service grid system messages? Any pitfalls?
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > >> > --
> > > > > > > > > >> > Denis
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > > >> > On Fri, Mar 23, 2018 at 8:01 AM, Denis Mekhanikov <dmekhanikov@gmail.com> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> >> Igniters,
> > > > > > > > > >> >>
> > > > > > > > > > >> >> I'd like to start a discussion on the Ignite service grid redesign.
> > > > > > > > > > >> >> We have a number of problems in our current architecture that have to
> > > > > > > > > > >> >> be addressed.
> > > > > > > > > >> >>
> > > > > > > > > > >> >> Here are the most severe ones:
> > > > > > > > > > >> >>
> > > > > > > > > > >> >> One of them is the lack of a guarantee that a service is successfully
> > > > > > > > > > >> >> deployed and ready for work by the time the *IgniteServices.deploy*()*
> > > > > > > > > > >> >> methods return.
> > > > > > > > > > >> >> Furthermore, if an exception is thrown from the *Service.init()* method,
> > > > > > > > > > >> >> the deploying side is not able to receive it, or even to understand that
> > > > > > > > > > >> >> the service is in an unusable state.
> > > > > > > > > > >> >> So you may end up in a situation where you deployed a service without
> > > > > > > > > > >> >> receiving any errors, then called a service method and hung indefinitely
> > > > > > > > > > >> >> on this invocation.
> > > > > > > > > > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
> > > > > > > > > >> >>
> > > > > > > > > > >> >> Another problem is locking during service deployment on an unstable
> > > > > > > > > > >> >> topology.
> > > > > > > > > > >> >> This issue is caused by missing updates in continuous query listeners on
> > > > > > > > > > >> >> the internal cache.
> > > > > > > > > > >> >> It is hard to reproduce, but it happens sometimes. We shouldn't allow
> > > > > > > > > > >> >> deployment methods to hang without reporting anything.
> > > > > > > > > > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
> > > > > > > > > >> >>
> > > > > > > > > > >> >> I think we should change the deployment procedure to make it more
> > > > > > > > > > >> >> reliable.
> > > > > > > > > > >> >> Moving from operating over the internal replicated service cache to
> > > > > > > > > > >> >> sending custom discovery events seems to be a good idea.
> > > > > > > > > > >> >> Service deployment may trigger a discovery event that will make the
> > > > > > > > > > >> >> chosen nodes deploy the service, and the same event will notify other
> > > > > > > > > > >> >> nodes about the deployed service instances.
> > > > > > > > > > >> >> It will eliminate the need for distributed transactions on the internal
> > > > > > > > > > >> >> replicated system cache and make the service deployment protocol more
> > > > > > > > > > >> >> transparent.
> > > > > > > > > >> >>
> > > > > > > > > > >> >> There are a few points that should be taken into account, though.
> > > > > > > > > > >> >>
> > > > > > > > > > >> >> First of all, we can't wait for services to be deployed and initialised
> > > > > > > > > > >> >> in the discovery thread.
> > > > > > > > > > >> >> So we need to make the notification about the service deployment result
> > > > > > > > > > >> >> asynchronous, presumably over the communication protocol.
> > > > > > > > > > >> >> I can think of a procedure similar to the current exchange protocol,
> > > > > > > > > > >> >> where service deployment is initiated with an initial discovery message,
> > > > > > > > > > >> >> followed by asynchronous notifications from the hosting servers over
> > > > > > > > > > >> >> communication. Finally, one more discovery message will notify all nodes
> > > > > > > > > > >> >> about the service deployment result and the location of the deployed
> > > > > > > > > > >> >> service instances. The coordinator will be responsible for collecting
> > > > > > > > > > >> >> the deployment results in this scheme.
> > > > > > > > > >> >>
> > > > > > > > > > >> >> Another problem is failover when some nodes fail during deployment or
> > > > > > > > > > >> >> further work.
> > > > > > > > > > >> >> The following cases should be handled:
> > > > > > > > > > >> >>
> > > > > > > > > > >> >>    1. coordinator failure during deployment;
> > > > > > > > > > >> >>    2. failure of nodes that were chosen to host the service, during
> > > > > > > > > > >> >>    deployment;
> > > > > > > > > > >> >>    3. failure of nodes that contain deployed services, after the
> > > > > > > > > > >> >>    deployment.
> > > > > > > > > >> >>
> > > > > > > > > > >> >> The first case may be resolved either by continuing the deployment with
> > > > > > > > > > >> >> a new coordinator or by cancelling it.
> > > > > > > > > > >> >> The second case will require another node to be chosen and notified.
> > > > > > > > > > >> >> Maybe another discovery message will be needed.
> > > > > > > > > > >> >> The third case will require redeployment, so the coordinator should
> > > > > > > > > > >> >> track topology changes and redeploy failed services.
> > > > > > > > > >> >>
> > > > > > > > > > >> >> Another good improvement would be service versioning. This matter was
> > > > > > > > > > >> >> already discussed in another thread:
> > > > > > > > > > >> >> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> > > > > > > > > > >> >> Let's resume this discussion and state the final decision here.
> > > > > > > > > > >> >> This feature is closely connected to peer class loading, which currently
> > > > > > > > > > >> >> does not work for services.
> > > > > > > > > > >> >> So service versioning should be implemented along with peer class
> > > > > > > > > > >> >> loading.
> > > > > > > > > > >> >> JIRA ticket for versioning:
> > > > > > > > > > >> >> https://issues.apache.org/jira/browse/IGNITE-6069
> > > > > > > > > > >> >> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
> > > > > > > > > >> >>
> > > > > > > > > > >> >> Please share your thoughts. Constructive criticism is highly
> > > > > > > > > > >> >> appreciated.
> > > > > > > > > >> >>
> > > > > > > > > >> >> Denis
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> --
> > > > > > > > > >> Best Regards, Vyacheslav D.
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best Regards, Vyacheslav D.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
