ode-dev mailing list archives

From Sathwik B P <sathwik...@gmail.com>
Subject Re: Deploying process in the cluster
Date Fri, 05 Jun 2015 13:26:44 GMT
Please find my responses inline.

On Wed, Jun 3, 2015 at 10:22 PM, sudharma subasinghe <suba.11@cse.mrt.ac.lk>
wrote:

> Hi,
>
> Sorry for the late reply. I was trying to achieve a proper solution.
> Following is my approach.
>
> 1) I elected a master node for deployment. When the master node goes
> down, Hazelcast will elect the next-oldest node as master.
>

    Perfect
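
    As a rough illustration of the "oldest node is master" rule in point 1,
here is a minimal sketch in plain Java. It does not use the Hazelcast API:
OldestMemberElection and its methods are hypothetical names, and the list
only models members in join order. I believe Hazelcast's
Cluster.getMembers() returns members oldest first, but treat that as an
assumption to verify against the version in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "the oldest member is the master"; not Hazelcast code. */
public class OldestMemberElection {
    // Members in join order: index 0 joined first, so it is the oldest.
    private final List<String> membersByJoinOrder = new ArrayList<>();

    public void join(String nodeId) {
        membersByJoinOrder.add(nodeId);
    }

    public void leave(String nodeId) {
        membersByJoinOrder.remove(nodeId);
    }

    /** Returns the current master, or null when the cluster is empty. */
    public String master() {
        return membersByJoinOrder.isEmpty() ? null : membersByJoinOrder.get(0);
    }

    public boolean isMaster(String nodeId) {
        return nodeId.equals(master());
    }

    public static void main(String[] args) {
        OldestMemberElection cluster = new OldestMemberElection();
        cluster.join("node1");
        cluster.join("node2");
        cluster.join("node3");
        System.out.println("master: " + cluster.master());

        // When the master leaves, the next-oldest member takes over.
        cluster.leave("node1");
        System.out.println("master after failover: " + cluster.master());
    }
}
```

    Because every node can derive the same answer from the same ordered
member list, no extra election message is needed in this model.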


> 2) ODEServer can identify whether clustering is enabled by reading a
> property value from the ode-axis2.properties file. I introduced a new
> property called "ode-axis2.hazelcast.clustering.enabled". If clustering is
> not enabled, the server will work as it does today. If clustering is
> enabled, the cluster will be initialized.
>

    Perfect
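
    For reference, reading such a flag could look roughly like the sketch
below. Only the property name comes from point 2; the class ClusteringFlag
and its helper methods are hypothetical, and the real ODE configuration
classes may read properties differently. A missing property is treated as
"clustering disabled" so that existing installations keep working.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: decide whether clustering is on from a properties file. */
public class ClusteringFlag {
    public static final String KEY = "ode-axis2.hazelcast.clustering.enabled";

    /** A missing or non-"true" value means clustering stays disabled. */
    public static boolean isClusteringEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "false").trim());
    }

    /** Parses properties text; stands in for loading ode-axis2.properties. */
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(KEY + "=true\n");
        System.out.println("clustering enabled: " + isClusteringEnabled(props));
    }
}
```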


> 3) Manual deployment is the responsibility of the DeploymentPoller. I give
> the deployment capability to the master node's poller by setting
> "isDeploymentFromODEFileSystemAllowed()" to true. The other nodes will not
> be able to enter the check() method in DeploymentPoller, so deployment will
> be done only by the master node.
>

    Perfect
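
    The gating in point 3 can be pictured with the sketch below. GatedPoller
is a hypothetical stand-in, not the real DeploymentPoller; the
BooleanSupplier plays the role of isDeploymentFromODEFileSystemAllowed(),
and "deploying" is reduced to recording a directory name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Sketch of a poller whose check() only does work on the master node. */
public class GatedPoller {
    private final BooleanSupplier deploymentAllowed; // e.g. "am I the master?"
    private final List<String> deployed = new ArrayList<>();

    public GatedPoller(BooleanSupplier deploymentAllowed) {
        this.deploymentAllowed = deploymentAllowed;
    }

    /** One polling pass; non-master nodes return before touching anything. */
    public void check(List<String> candidateDirs) {
        if (!deploymentAllowed.getAsBoolean()) {
            return;
        }
        for (String dir : candidateDirs) {
            if (!deployed.contains(dir)) {
                deployed.add(dir); // placeholder for the real deployment work
            }
        }
    }

    public List<String> deployed() {
        return deployed;
    }

    public static void main(String[] args) {
        AtomicBoolean isMaster = new AtomicBoolean(false);
        GatedPoller poller = new GatedPoller(isMaster::get);

        poller.check(Arrays.asList("HelloWorld"));
        System.out.println("as slave:  " + poller.deployed());

        isMaster.set(true); // this node has just been elected master
        poller.check(Arrays.asList("HelloWorld"));
        System.out.println("as master: " + poller.deployed());
    }
}
```

    Evaluating the flag on every pass, rather than once at start-up, is
what lets a newly elected master begin deploying without a restart.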


>
> 4) In DeploymentWebService, I had to consider a few cases. If the deploy
> request goes to the master node, it will deploy the process through the web
> service. The other nodes' pollers will not enter the check() method as they
> are not masters, so the master can continue without any involvement of the
> others.
>
>
    Perfect


> 5) If the deploy request goes to a slave node, it will proceed only as far
> as creating the files in the file system. The slave stops at that point. As
> only the master's poller is checking, the master can continue from the files
> created in the file system.
>

    DeploymentWebService provides synchronous operations. The status of the
operation should be communicated to the calling client in the same call.
    DeploymentPoller is a backend thread that goes over every directory
under the deployment directory, checking for changes to deploy.xml in
existing processes and deploying newly added processes. This pass is
sequential and time-consuming. As the number of process directories grows,
so does the time taken for each run of the thread.

    Since the request arrives on a slave node and the processing is done on
the master node, how do you check for completion of the
deployment/undeployment and respond to the client, given that the web
service call is a synchronous operation? As DeploymentPoller takes a lot of
time in processing, won't your request time out?
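
    To make the concern concrete, here is a sketch of what the slave-side
web service would have to do: block until the master's poller has finished,
or give up at a deadline. DeployedMarkerWait and awaitMarker are
hypothetical names, and using a ".deployed" marker file as the completion
signal is only an assumption based on the marker mentioned earlier in this
thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: wait for a completion marker written by another node. */
public class DeployedMarkerWait {

    /**
     * Polls until the marker file exists or the timeout elapses.
     * Returns true if the marker was seen, false on timeout or interrupt.
     */
    public static boolean awaitMarker(Path marker, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (Files.exists(marker)) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false; // the synchronous caller would report a timeout
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("ode-deploy");
        Path marker = dir.resolve("HelloWorld.deployed"); // hypothetical name

        // Nobody has deployed yet, so the wait times out.
        System.out.println("seen before deploy: " + awaitMarker(marker, 200, 20));

        // Simulate the master's poller finishing the deployment.
        Files.createFile(marker);
        System.out.println("seen after deploy:  " + awaitMarker(marker, 200, 20));
    }
}
```

    The wait is bounded by the client's timeout, while the poller's pass
grows with the number of process directories, so the two can diverge as
the deployment directory fills up.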


>
> 6) But there was a problem with _deploymentUnits in ProcessStoreImpl. Each
> node's _deploymentUnits stores only what that server has deployed. Suppose
> the master node goes down and another master takes over. Its
> _deploymentUnits does not have the dus deployed by the earlier master node.
> Hence it will not be able to retire the earlier version of the process
> deployed by the previous master, so there will be two processes in the
> "ACTIVE" state.


> 7) To avoid this, I added the ODEServer as an observer that checks when a
> new master is elected and then loads all the deployment units out of the
> store. The new master node then has all the dus and can retire the
> appropriate version. Usually loadAll() is called at server start-up time,
> but I see no other way to solve this. I tried to use a Hazelcast IMap to
> store all dus among all nodes, but it was not successful as the du is not a
> serializable object.
>

> 8) I figured out that we do not need to send a cluster message to the
> others, as all the dus' data are persisted to the shared DB. Each node can
> take the du and retrieve the necessary data using methods already
> implemented in the Process Store.
>
> 9) But there is another problem. The axis2 service corresponding to a
> deployed process does not appear on all nodes of the cluster. That is
> because each server adds only the dus it deployed itself to the process
> store. That is why I had to use loadAll() when masters change. How to
> solve this?
>

    I do appreciate your efforts in understanding the implementation and
the changes that need to be done. You are bang on.
    fireEvent(..) is the method that triggers process activation and the
necessary service creation.


    But with the approach given in steps 6 to 8, ODE cannot have at least
2 active servers for load balancing. You are concentrating on only one
active node that will do deployments and cater to process invocations.
    We should also think about scaling ODE to multiple servers to handle
load.
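
    One possible direction, along the lines of the earlier suggestion in
this thread to send an event over Hazelcast to all nodes, is sketched
below. Topic and Node are hypothetical in-memory stand-ins: a real setup
might use a Hazelcast ITopic for the broadcast and call into the process
store's activation path on each node, but neither is shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Sketch: one node deploys, every node activates the process locally. */
public class DeploymentBroadcast {

    /** Minimal in-memory publish/subscribe channel. */
    public static class Topic {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        public void subscribe(Consumer<String> listener) {
            listeners.add(listener);
        }

        public void publish(String deploymentUnit) {
            for (Consumer<String> listener : listeners) {
                listener.accept(deploymentUnit);
            }
        }
    }

    /** A cluster node that activates whatever deployment it hears about. */
    public static class Node {
        private final String name;
        private final Set<String> activeProcesses = new LinkedHashSet<>();

        public Node(String name, Topic topic) {
            this.name = name;
            topic.subscribe(this::activate);
        }

        // Placeholder for loading the du from the shared store and
        // creating the services on this node.
        private void activate(String deploymentUnit) {
            activeProcesses.add(deploymentUnit);
        }

        public String name() {
            return name;
        }

        public Set<String> activeProcesses() {
            return activeProcesses;
        }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        Node node1 = new Node("node1", topic);
        Node node2 = new Node("node2", topic);

        // node1 finishes a deployment and tells everyone, itself included.
        topic.publish("HelloWorld-1");

        System.out.println(node1.name() + ": " + node1.activeProcesses());
        System.out.println(node2.name() + ": " + node2.activeProcesses());
    }
}
```

    With something like this, every node that the load balancer can reach
would have the process activated, not only the node that deployed it.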

    What do you think?


> Thank you,
> Sudharma
>
> On 2 June 2015 at 08:51, Sathwik B P <sathwik.bp@gmail.com> wrote:
>
> > Sudharma,
> >
> > Any updates?
> >
> > regards,
> > sathwik
> >
> > On Fri, May 29, 2015 at 5:26 PM, Sathwik B P <sathwik.bp@gmail.com>
> wrote:
> >
> > > Sudharma,
> > >
> > > Can you elaborate on your option 1).
> > >
> > > Response to your option 2).
> > >
> > >     Process Store is the component that handles process metadata,
> > > compilation and deployment in ODE. Integration layers in ODE (Axis2,
> > > JBI) use the process store.
> > >     Future implementations of IL for ODE will also use the process
> > > store. We should not be thinking of moving the process store
> > > functionality to the integration layers.
> > >
> > >
> > > On Thu, May 28, 2015 at 9:33 PM, sudharma subasinghe <
> > > suba.11@cse.mrt.ac.lk> wrote:
> > >
> > >> Hi,
> > >>
> > > >> I understood the problem within dynamic master/slave configuration.
> > > >> In my approach, when a deployment request is routed to a slave node
> > > >> there will not be a deployment. I suggest two options to avoid it.
> > > >> 1) Have a static master/slave configuration only for process
> > > >> deployment
> > > >> 2) Modify the deployment web service to compile and verify the
> > > >> process and then copy it to the deploy folder irrespective of
> > > >> whether it is a master or slave; the deployment poller should then
> > > >> take care of the deployment
> > >>
> > >
> > >>
> > >> On 28 May 2015 at 14:43, Sathwik B P <sathwik.bp@gmail.com> wrote:
> > >>
> > >> > Sudharma,
> > >> >
> > > >> > We definitely need a master/slave in the hazelcast cluster. This is
> > > >> > probably needed for the job migration in the Scheduler to migrate
> > > >> > the jobs associated with a down node. Let's hold this topic for
> > > >> > future discussion.
> > > >> >
> > > >> > Going by the explanation, having the master/slave nodes perform
> > > >> > certain predefined tasks is perfectly fine.
> > >> >
> > >> > I have this scenario,
> > >> >
> > >> > I am using HAProxy as my load balancer and configured 3 nodes in the
> > >> > cluster.
> > >> >
> > >> > Node1 - Active
> > >> > Node2 - Active
> > >> > Node3 - Backup
> > >> >
> > >> > Load balancing algorithm: RoundRobin
> > >> >
> > > >> > A Backup node (Node3) is one which the load balancer will not route
> > > >> > requests to until one of the Active nodes, i.e. either Node1 or
> > > >> > Node2, has gone down.
> > > >> >
> > > >> > All 3 nodes are also part of the hazelcast cluster.
> > > >> >
> > > >> > In the hazelcast cluster, assume Node1 is elected as the
> > > >> > leader/master and Node2 and Node3 as slaves.
> > > >> >
> > > >> > I initiate the deploy operation on the DeploymentWebService, which
> > > >> > the load balancer routes to one of the Active nodes in the cluster,
> > > >> > let's say Node1. Since Node1 is also the master in the hazelcast
> > > >> > cluster, deployment is a success.
> > > >> >
> > > >> > I initiate another deploy operation on the DeploymentWebService,
> > > >> > which the load balancer routes to the next active node, which is
> > > >> > Node2. Since Node2 is a slave in the Hazelcast cluster, what happens
> > > >> > to the deployment?
> > >> >
> > >> > regards,
> > >> > sathwik
> > >> >
> > >> > On Wed, May 27, 2015 at 10:55 PM, sudharma subasinghe <
> > >> > suba.11@cse.mrt.ac.lk
> > >> > > wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > > >> > > I will explain my approach as much as possible. The oldest node
> > > >> > > in the hazelcast cluster is elected as the master node. On
> > > >> > > failure of the master node, the next oldest node will be elected
> > > >> > > as the master node. This master-slave configuration is just for
> > > >> > > deployment. When the hazelcast cluster elects the master node,
> > > >> > > that node becomes the master node for deploying processes, so it
> > > >> > > will deploy the artifacts. If you want to get the idea of
> > > >> > > electing the master node, please refer to the code which I have
> > > >> > > put on GitHub
> > > >> > > (https://github.com/Subasinghe/ode/tree/ode_clustering).
> > >> > >
> > > >> > > I identified separate actions which should be followed by the
> > > >> > > master and slave nodes.
> > > >> > > Actions which are followed by the master node only:
> > > >> > > 1) create deployment unit
> > > >> > > 2) set the version number on the deployment unit
> > > >> > > 3) compile deployment unit
> > > >> > > 4) scan deployment unit
> > > >> > > 5) retire previous versions
> > > >> > > Master node and slave nodes should create _processes, which stores
> > > >> > > ProcessConfImpl.
> > > >> > > Only the master node will write the version number to the
> > > >> > > database and create the .deployed file.
> > > >> > >
> > > >> > > So there are some actions which should be followed only by the
> > > >> > > master node, while other actions should be followed by all the
> > > >> > > nodes. The idea of having a master node is to deploy artifacts and
> > > >> > > prevent the others from writing the version number to the
> > > >> > > database.
> > > >> > > Whether a node is active or passive, all nodes should do the
> > > >> > > deployment. Master and slaves will follow the necessary actions
> > > >> > > described above.
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On 27 May 2015 at 15:49, Sathwik B P <sathwik.bp@gmail.com>
> wrote:
> > >> > >
> > >> > > > Nandika,
> > >> > > >
> > > >> > > > I very well understand what you have put across, but it's
> > > >> > > > secondary to me now.
> > > >> > > >
> > > >> > > > Sudharma,
> > > >> > > > My primary concern is to understand at a high level the
> > > >> > > > deployment architecture and how a master-slave configuration
> > > >> > > > would fit in. Are there any restrictions imposed by the
> > > >> > > > in-progress design?
> > > >> > > >
> > > >> > > > Firstly, how would ODE process deployment work under these
> > > >> > > > cluster configurations?
> > > >> > > >
> > > >> > > > Sample cluster configurations (a load balancer is frontending
> > > >> > > > the servers):
> > > >> > > > 1) Cluster consisting of 2 nodes, all Active-Active.
> > > >> > > > 2) Cluster consisting of 2 nodes, Active-Passive.
> > > >> > > > 3) Cluster with 2+ nodes, with the additional nodes either
> > > >> > > > Active or Passive.
> > >> > > >
> > >> > > > regards,
> > >> > > > sathwik
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Wed, May 27, 2015 at 3:04 PM, Nandika Jayawardana <
> > >> > jayawark@gmail.com
> > >> > > >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi Sathwik,
> > >> > > > >
> > > >> > > > > According to my understanding, in the clustering scenario the
> > > >> > > > > master node should perform all the deployment actions, and
> > > >> > > > > the slave nodes also need to perform some deployment actions.
> > > >> > > > > For example, the slave nodes should also handle the process
> > > >> > > > > ACTIVATED event so that the process configuration is added to
> > > >> > > > > the engine and the necessary web services are created. Then,
> > > >> > > > > when the load balancer sends requests to any node in the
> > > >> > > > > cluster, that node is ready to accept them.
> > >> > > > >
> > >> > > > > Regards
> > >> > > > > Nandika
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Wed, May 27, 2015 at 12:30 PM, Sathwik B P <
> > >> sathwik.bp@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Sudharma,
> > >> > > > > >
> > > >> > > > > > Where are you going to configure the master-slaves? Is it
> > > >> > > > > > at the web application level or at the load balancer?
> > >> > > > > >
> > >> > > > > > regards,
> > >> > > > > > sathwik
> > >> > > > > >
> > >> > > > > > On Tue, May 26, 2015 at 7:42 PM, sudharma subasinghe
<
> > >> > > > > > suba.11@cse.mrt.ac.lk>
> > >> > > > > > wrote:
> > >> > > > > >
> > >> > > > > > > Hi Tammo,
> > >> > > > > > >
> > > >> > > > > > > Can you suggest the best method of these to implement?
> > > >> > > > > > > Since I first suggested the master-slaves scenario, I
> > > >> > > > > > > think it is easier to implement than the distributed lock
> > > >> > > > > > > scenario. However, if you can suggest one of these two,
> > > >> > > > > > > then I can think about it.
> > >> > > > > > >
> > >> > > > > > > Thank you
> > >> > > > > > >
> > >> > > > > > > On 21 May 2015 at 12:40, Sathwik B P <
> sathwik.bp@gmail.com>
> > >> > wrote:
> > >> > > > > > >
> > >> > > > > > > > With respect to the hotdeployment,
> > >> > > > > > > >
> > > >> > > > > > > > We can drop the deployment archive into the deployment
> > > >> > > > > > > > folder. Since the DeploymentPollers acquire the
> > > >> > > > > > > > distributed lock for the DeploymentUnit, only one of the
> > > >> > > > > > > > nodes will get the lock and initiate the deployment.
> > > >> > > > > > > > DeploymentPollers on other nodes will fail to acquire
> > > >> > > > > > > > the lock and hence will silently ignore it.
> > >> > > > > > > >
> > >> > > > > > > > On Thu, May 21, 2015 at 12:30 PM, Sathwik
B P <
> > >> > > > sathwik.bp@gmail.com>
> > >> > > > > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Hi Tammo,
> > >> > > > > > > > >
> > > >> > > > > > > > > The distributed lock acquisition on the DeploymentUnit
> > > >> > > > > > > > > should be added to both DeploymentWebService and
> > > >> > > > > > > > > DeploymentPoller.
> > > >> > > > > > > > >
> > > >> > > > > > > > > When a deployment operation is initiated through the
> > > >> > > > > > > > > DeploymentWebService, the load balancer routes it to
> > > >> > > > > > > > > any of the available nodes.
> > > >> > > > > > > > >
> > > >> > > > > > > > > On the routed node, the DeploymentWebService acquires
> > > >> > > > > > > > > the distributed lock. On the remaining nodes the
> > > >> > > > > > > > > DeploymentPoller will try to acquire the distributed
> > > >> > > > > > > > > lock, will not get it, and hence will silently ignore
> > > >> > > > > > > > > it.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Once the routed node completes the deployment, it will
> > > >> > > > > > > > > release the lock. This way we don't have to stall the
> > > >> > > > > > > > > DeploymentPoller on other nodes.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Does it answer the concerns?
> > > >> > > > > > > > >
> > > >> > > > > > > > >
> > > >> > > > > > > > > Now, if we give the responsibility of identifying the
> > > >> > > > > > > > > master node to hazelcast, how do we plan to notify the
> > > >> > > > > > > > > load balancer to change its configuration about the
> > > >> > > > > > > > > master node?
> > > >> > > > > > > > > Assuming there are 3 nodes in the cluster,
> > > >> > > > > > > > > node1 - master
> > > >> > > > > > > > > node2 - slave
> > > >> > > > > > > > > node3 - slave
> > > >> > > > > > > > >
> > > >> > > > > > > > > Node1 goes down, the LB will promote Node2 as master
> > > >> > > > > > > > > node, but hazelcast might promote Node3 as master node.
> > > >> > > > > > > > > They are out of sync.
> > > >> > > > > > > > >
> > > >> > > > > > > > > Is this argument valid?
> > >> > > > > > > > >
> > >> > > > > > > > > regards,
> > >> > > > > > > > > sathwik
> > >> > > > > > > > >
> > >> > > > > > > > > On Wed, May 20, 2015 at 1:51 PM,
Tammo van Lessen <
> > >> > > > > > > tvanlessen@gmail.com>
> > >> > > > > > > > > wrote:
> > >> > > > > > > > >
> > >> > > > > > > > >> Hi Sudharma,
> > >> > > > > > > > >>
> > > >> > > > > > > > >> what do you expect from the "other nodes deployment"?
> > > >> > > > > > > > >> Compilation is not needed since the CBP file is written
> > > >> > > > > > > > >> to the (shared) FS. Registration is also not needed,
> > > >> > > > > > > > >> since it is done via the shared database. So the only
> > > >> > > > > > > > >> thing that might be needed is to tell the engine that
> > > >> > > > > > > > >> there is a new deployment. I'd need to check that. If
> > > >> > > > > > > > >> this is needed, I revert my last statement; then it is
> > > >> > > > > > > > >> perhaps better to just send an event over Hazelcast to
> > > >> > > > > > > > >> all nodes that the deployment has changed.
> > >> > > > > > > > >>
> > >> > > > > > > > >> Best,
> > >> > > > > > > > >>   Tammo
> > >> > > > > > > > >>
> > >> > > > > > > > >> On Wed, May 20, 2015 at 10:13
AM, sudharma
> subasinghe <
> > >> > > > > > > > >> suba.11@cse.mrt.ac.lk
> > >> > > > > > > > >> > wrote:
> > >> > > > > > > > >>
> > >> > > > > > > > >> > Hi Tammo,
> > >> > > > > > > > >> >
> > > >> > > > > > > > >> > The master node writes meta data. But runtime
> > > >> > > > > > > > >> > information must be available in all nodes. Since the
> > > >> > > > > > > > >> > folder is shared, all nodes will see the availability
> > > >> > > > > > > > >> > of a new process. My idea is for the master node to
> > > >> > > > > > > > >> > write the meta data and the other nodes to just read
> > > >> > > > > > > > >> > the meta data and load the process. So we need a small
> > > >> > > > > > > > >> > delay between the master node's deployment and the
> > > >> > > > > > > > >> > other nodes' deployment.
> > > >> > > > > > > > >> >
> > > >> > > > > > > > >> > Is there any way to set a delay between the master
> > > >> > > > > > > > >> > node and the slaves until the master node finishes the
> > > >> > > > > > > > >> > deployment?
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > Thank you
> > >> > > > > > > > >> > Sudharma
> > >> > > > > > > > >> >
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > On 20 May 2015 at 13:01,
Tammo van Lessen <
> > >> > > > tvanlessen@gmail.com
> > >> > > > > >
> > >> > > > > > > > wrote:
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > > Hi Sathwik,
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > On Wed, May 20, 2015
at 6:40 AM, Sathwik B P <
> > >> > > > > > > sathwik.bp@gmail.com>
> > >> > > > > > > > >> > wrote:
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > > Sudharma/Tammo,
> > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > > > 1) How do we plan to decide which is the master node
> > > >> > > > > > > > >> > > > in the cluster?
> > > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > > I think the easiest approach is to always elect the
> > > >> > > > > > > > >> > > oldest node in the cluster to be the master. AFAIK
> > > >> > > > > > > > >> > > Hazelcast can easily be asked for this information.
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > > > 2) Don't we need to stall the Deployment Pollers in
> > > >> > > > > > > > >> > > > the slave nodes?
> > > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > > Absolutely.
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > > Suggestion:
> > > >> > > > > > > > >> > > > I am not sure whether we need Master-Slaves. Why not
> > > >> > > > > > > > >> > > > give every node in the cluster the same status
> > > >> > > > > > > > >> > > > (Active-Active)?
> > > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > > > When a new deployment is made, the load balancer can
> > > >> > > > > > > > >> > > > push it to any of the available nodes. That node will
> > > >> > > > > > > > >> > > > probably acquire a distributed lock on the deployment
> > > >> > > > > > > > >> > > > unit and act as master for that deployment. This
> > > >> > > > > > > > >> > > > ensures optimum usage of the cluster nodes. Probably
> > > >> > > > > > > > >> > > > no static configuration of Master-Slave in the load
> > > >> > > > > > > > >> > > > balancer nor in hazelcast.
> > > >> > > > > > > > >> > > >
> > > >> > > > > > > > >> > >
> > > >> > > > > > > > >> > > But this would not allow us to keep the hotdeployment
> > > >> > > > > > > > >> > > via filesystem enabled, right?
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > Best,
> > >> > > > > > > > >> > >   Tammo
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > --
> > >> > > > > > > > >> > > Tammo van Lessen
- http://www.taval.de
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> >
> > >> > > > > > > > >>
> > >> > > > > > > > >>
> > >> > > > > > > > >>
> > >> > > > > > > > >> --
> > >> > > > > > > > >> Tammo van Lessen - http://www.taval.de
> > >> > > > > > > > >>
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
