cloudstack-users mailing list archives

From Zaeem Arshad <zaeem.ars...@gmail.com>
Subject Re: Hosts going down
Date Tue, 07 Mar 2017 10:28:53 GMT
Are there any errors or log messages on the console? We ran into a number
of issues with XenServer and Supermicro combinations in the past, though on
AMD hardware. BIOS and Xen updates fixed those for us, apart from two bugs
that required patches from Citrix; we had to generate core dumps for those.
If you have Citrix support, enable the serial console on the servers and
boot them using the xe-serial GRUB entry. The next time a server hangs,
trigger a core dump from the IPMI console. The dump can help Citrix or the
community uncover the issue.
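
As a rough sketch of the IPMI side, assuming ipmitool is installed on an
admin machine and the BMC is reachable over the network: the helper below
only builds the command lines, it does not run them. The host name and user
are placeholders, and whether a diagnostic interrupt (NMI) actually produces
a crash dump depends on how the host is configured, so treat this as an
illustration rather than the procedure from the Citrix article.

```python
import shlex


def ipmitool_cmd(bmc_host, user, *action, interface="lanplus"):
    """Build an ipmitool argument list for a remote BMC.

    -E tells ipmitool to read the password from the IPMI_PASSWORD
    environment variable, so it never appears on the command line.
    bmc_host and user are placeholders supplied by the caller.
    """
    return ["ipmitool", "-I", interface, "-H", bmc_host, "-U", user, "-E", *action]


def serial_console_cmd(bmc_host, user):
    # Attach to the serial-over-LAN console to watch the host's serial output.
    return ipmitool_cmd(bmc_host, user, "sol", "activate")


def diagnostic_interrupt_cmd(bmc_host, user):
    # Pulse a diagnostic interrupt (NMI) at the hung host. Assumption: the
    # host is set up to crash and dump on NMI; otherwise nothing is saved.
    return ipmitool_cmd(bmc_host, user, "chassis", "power", "diag")


if __name__ == "__main__":
    # Hypothetical BMC address and user name, for illustration only.
    for cmd in (serial_console_cmd("bmc.example.net", "ADMIN"),
                diagnostic_interrupt_cmd("bmc.example.net", "ADMIN")):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running
anything against a production host.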

Link:
https://support.citrix.com/article/CTX121442




On Tue, Mar 7, 2017 at 12:12 PM, Makrand <makrandsanap@gmail.com> wrote:

> All,
>
> This is a little bit off-topic; it is more of a hardware issue.
>
> For our CloudStack, the compute nodes (XenServer 6.2) are running on
> Supermicro SuperServer 5017R-MTF
> <http://www.supermicro.com/products/system/1u/5017/sys-5017r-mtf.cfm>.
> Sometimes a server in a pool goes down on its own (the host appears as
> disconnected from the resource pool in XenCenter). We then have to power
> cycle it or start it using IPMI. Over the last year this has happened with
> a few servers in different zones.
>
> I was wondering if anyone using similar Supermicro servers (as part of
> cloud or virt environments) had faced this issue and knows what could be
> done to fix it?
>
> 1) A BIOS update is on my list, but I think some servers already have the
> latest BIOS. Plus we have 60+ hosts across all zones, so upgrading the
> BIOS is a lengthy task.
>
> 2) FYI, on Supermicro's recommendation, I've already disabled the ACPI
> sleep state in the BIOS (something like this screenshot:
> https://snag.gy/rhvbWE.jpg), but I don't think that helped.
>
> 3) Is there any config/setting in XenServer that might be needed to handle
> ACPI properly?
>
>
> --
> Makrand
>
