cloudstack-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rafael Weingärtner <rafaelweingart...@gmail.com>
Subject Re: Very long time to destroy VM
Date Wed, 13 Apr 2016 18:01:47 GMT
Well, here goes one possible explanation. If I had to bet, I would bet on
this one, and not on some chunk of code that might be synchronized.

When you use the destroy command, first the ACS stops the VM. The stop
process is the one that can be slow. The OS of the VM might have taken time
long time to stop process nicely. If you use the Stop force, or the destroy
command (with expunge=true), I do not remember the whole code; but, ACS use
the “force” flag on the stop command. That tells for the hypervisor simply
to remove the VMM and do not care about the consequences.

Sto, when you use a normal stop/destroy command, you rely on the shutdown
process of the OS and that can be slow sometimes, giving that it will try
to stop all process nicely.

On Wed, Apr 13, 2016 at 11:56 AM, Simon Godard <sgodard@cloudops.com> wrote:

> The stop operation seems to be as quick as usual. Again, we don’t have
> slow destroy on all VMs. It occurred twice in a short time frame but we
> didn’t experience it since then. I just want to understand the root cause
> to see if the management server performance was at fault or if it’s a
> concurrency bug in the code.
>
> --
> Simon
>
> > On Apr 13, 2016, at 10:18, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
> >
> > If you just use the stop option? Is it taking a long time too?
> >
> > On Wed, Apr 13, 2016 at 10:37 AM, Simon Godard <sgodard@cloudops.com>
> wrote:
> >
> >> We are using XenServer 6.2.
> >>
> >> Most VM destroy (expunge=true) are fairly quick. Is there anything else
> I
> >> could be looking for? At the time of the slow destroy, there weren’t a
> very
> >> high number of async jobs ongoing. I suspect it could be related to a DB
> >> concurrency issue, looking at this log I just found:
> >>
> >> 2016-04-11 15:24:30,777 ERROR
> >> [org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl]
> >> (Work-Job-Executor-17:ctx-4492ef70 job-1480365/job-1480367) Unexpected
> >> exception:
> >> DB Exception on: com.mysql.jdbc.PreparedStatement@3f603c42: UPDATE
> >> sync_queue SET sync_queue.queue_size=0,
> sync_queue.last_updated='2016-04-11
> >> 19:24:30' WHERE sync_queue.id = 905339
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:851)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:804)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:1313)
> >>    at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
> >>    at
> >>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>    at java.lang.reflect.Method.invoke(Method.java:606)
> >>    at
> >>
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> >>    at
> >>
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> >>    at
> >>
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> >>    at
> >>
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> >>    at com.sun.proxy.$Proxy164.update(Unknown Source)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$4.doInTransactionWithoutResult(SyncQueueManagerImpl.java:197)
> >>    at
> >>
> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> >>    at
> >> com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
> >>    at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
> >>    at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.purgeItem(SyncQueueManagerImpl.java:182)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:524)
> >>    at
> >>
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> >>    at
> >>
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:459)
> >>    at
> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >>    at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>    at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>    at java.lang.Thread.run(Thread.java:745)
> >> Caused by:
> >> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException:
> Deadlock
> >> found when trying to get lock; try restarting transaction
> >>    at sun.reflect.GeneratedConstructorAccessor825.newInstance(Unknown
> >> Source)
> >>    at
> >>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >>    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> >>    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
> >>    at com.mysql.jdbc.Util.getInstance(Util.java:386)
> >>    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1065)
> >>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
> >>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
> >>    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
> >>    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
> >>    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
> >>    at
> >>
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> >>    at
> >>
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:843)
> >>    ... 32 more
> >> --
> >> Simon
> >>
> >>> On Apr 13, 2016, at 09:20, Rafael Weingärtner <
> >> rafaelweingartner@gmail.com> wrote:
> >>>
> >>> What hypervisor are you using?
> >>>
> >>> Every single VM in your environment is presenting this behavior?
> >>>
> >>>
> >>>
> >>> On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sgodard@cloudops.com>
> >> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I am trying to understand why a destroyVirtualMachine API call would
> >> take
> >>>> around 1 hour to get a successful async job result. From CloudStack
> >> log, I
> >>>> can see that the StopVmCmd occurred right away, but the DestroyVmCmd
> >> took 1
> >>>> hour to complete.
> >>>>
> >>>> Do you know what could cause such delays?
> >>>>
> >>>> The only suspicious log entries before the 1 hour delay are:
> >>>>
> >>>> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying
> >> vm
> >>>> VM[User|i-2278-11598-VM]
> >>>> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable
> >> to
> >>>> find lock for the key vm_instance11598 and thread id 1817738544
> >>>>
> >>>> Then 1 hour later:
> >>>> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
> >>>> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
> >>>> original host id: 111 new host id: null host id before state
> transition:
> >>>> null
> >>>>
> >>>> Thanks!
> >>>> Simon Godard
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Rafael Weingärtner
> >>
> >>
> >
> >
> > --
> > Rafael Weingärtner
>
>


-- 
Rafael Weingärtner

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message