cloudstack-users mailing list archives

From Ian Young <iyo...@ratespecial.com>
Subject Re: services not running after reboot
Date Fri, 10 Oct 2014 23:33:25 GMT
I've restarted all the services and restarted the servers too.  The SSVM
and CP start with no trouble.  Every time I try to start or create an
instance, I see repeated messages like these:

/var/log/cloudstack/agent/cloudstack-agent.out:
2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-19-VM -p %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 .  Output is:

/var/log/cloudstack/agent/security_group.log:
2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
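
The -p argument in the timed-out command above is a run of %-delimited key=value pairs, which is hard to check by eye. A minimal sketch for splitting it into readable fields follows; the function name parse_boot_args is made up for illustration and is not part of CloudStack, and the input is copied from the log line above.

```python
def parse_boot_args(arg: str) -> dict:
    """Split a '%key=value%key=value' string into a dict (illustrative helper)."""
    pairs = {}
    for field in arg.split("%"):
        if not field:
            continue  # skip the empty piece before the leading '%'
        key, _, value = field.partition("=")
        pairs[key] = value
    return pairs


# Argument copied from the agent log entry for r-19-VM.
BOOT_ARGS = (
    "%template=domP%name=r-19-VM%eth0ip=192.168.102.89"
    "%eth0mask=255.255.255.0%gateway=192.168.102.1"
    "%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1"
    "%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr"
    "%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3"
)

args = parse_boot_args(BOOT_ARGS)
for key, value in args.items():
    print(f"{key:20} {value}")
```

Printing one pair per line makes it easier to compare the addresses the agent tried to pass to the router against the network's actual settings.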

On Fri, Oct 10, 2014 at 3:04 PM, Ian Young <iyoung@ratespecial.com> wrote:

> I tried to restart the network with the "clean up" option, via the web
> console.  After several minutes, it failed to restart the network.  The
> SSVM and CP are still running but the VR no longer exists.  Why would these
> be able to start but not the virtual router?
>
> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young <iyoung@ratespecial.com> wrote:
>
>> I restarted the libvirtd service and the management service is now fully
>> started (there are services listening on ports 8250 and 9090).  The SSVM
>> health check script now reports no problems.
>>
>> However, I tried starting an instance and both the instance and the
>> virtual router are in a "starting" state but have been so for almost 10
>> minutes.  In the catalina.out log I see:
>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 4, postpone power-change report by resetting power-change counters
>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 13, postpone power-change report by resetting power-change counters
>>
>> I'm also seeing this in the agent.log:
>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource] (Script-6:null) Interrupting script.
>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 .  Output is:
>>
>> And in the security_group.log:
>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
>>
>> What does this mean?
>>
>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young <iyoung@ratespecial.com>
>> wrote:
>>
>>> This morning I was unable to start new instances.  I discovered that I
>>> could SSH into the SSVM and the console proxy but not the virtual router.
>>> Something strange was happening so I thought it might be a good time to
>>> gracefully stop all the instances and reboot the hypervisor to see if the
>>> VR would start working again.  I also rebooted the management server (a
>>> separate machine) to have a clean slate.  Now that they've both been
>>> rebooted, the following symptoms exist:
>>>
>>> * On the management server, there are no services listening on 9090 or
>>> 8250.
>>> * When I run the SSVM health check script, it says NFS is not currently
>>> mounted.
>>> * The management server log is reporting that Zone 1 is not ready to
>>> launch SSVM/CP yet, even though both of those are running.
>>>
>>> The NFS server is running just fine.  I can mount it in the management
>>> server with no problems.  I've restarted cloudstack-management and
>>> cloudstack-agent but the problems persist.  The "not ready to launch
>>> SSVM/CP yet" messages sound like the management server and the hypervisor
>>> are not communicating or some information about the system state is out of
>>> sync.  How can I confirm this?
>>>
>>
>>
>
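
One way to re-check the first symptom in the original report (nothing listening on 8250 or 9090) after each restart is a plain TCP connect test. This is only a sketch: the helper name is_listening is made up, and it assumes the script is run on the management server itself against 127.0.0.1. A successful connect shows only that something accepts connections on the port, not that the service behind it is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the thread; the host is an assumption.
    for port in (8250, 9090):
        state = "listening" if is_listening("127.0.0.1", port) else "NOT listening"
        print(f"port {port}: {state}")
```

Running the same check from the hypervisor against the management server's address (in place of 127.0.0.1) would also show whether the agent can reach port 8250 across the network.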
