cloudstack-users mailing list archives

From Özhan Rüzgar Karaman <oruzgarkara...@gmail.com>
Subject Re: KVM HA is broken, let's fix it
Date Mon, 19 Oct 2015 08:47:11 GMT
Hi;
IPMI fencing is the technology that most cloud platforms, such as oVirt,
use, so it's a good choice. How can we test this IPMI fencing feature, and
where can I find its scripts and its usage/test documentation? I have some
test hardware and I would really like to try it.

Thanks
Özhan

On Sat, Oct 17, 2015 at 2:44 AM, ilya <ilya.mailing.lists@gmail.com> wrote:

> Please see another thread on DEV that proposes the fix for KVM HA ->
> [DISCUSS] KVM HA with IPMI Fencing
>
>
> ----
>
> We propose the following solution that in our understanding should cover
> all use cases and provide a fencing mechanism.
>
> NOTE: The proposed IPMI fencing is just a script. If you are using HP
> hardware with ILO, it could be an ILO executable with specific
> parameters. In theory, this can be *any* script, not just IPMI.
>
> Please take a few minutes to read this through, to avoid duplicate efforts...
>
>
> Proposed FS below:
> ----------------
>
>
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/KVM+HA+with+IPMI+Fencing
>
>
> On 10/12/15 12:54 AM, Frank Louwers wrote:
> >
> >> On 10 Oct 2015, at 12:35, Remi Bergsma <RBergsma@schubergphilis.com>
> wrote:
> >>
> >> Can you please explain what the issue is with KVM HA? In my tests, HA
> starts all VMs just fine without the hypervisor coming back. At least that
> is on current 4.6. Assuming a cluster of multiple nodes of course. It will
> then do a neighbor check from another host in the same cluster.
> >>
> >> Also, malfunctioning NFS leads to corruption and therefore we fence a
> box when the shared storage is unreliable. Combining primary and secondary
> NFS is not a good idea for production in my opinion.
> >
> > Well, it depends how you look at it, and what your situation is.
> >
> > If you use 1 NFS export as primary storage (and only NFS), then yes,
> the system works as one would expect, and doesn’t need to be fixed.
> >
> > However, HA is “not functioning” in any of these scenarios:
> >
> > - you don’t use NFS as your only primary storage
> > - you use more than one NFS primary storage
> >
> > Even worse: imagine you only use local storage as primary storage, but
> have 1 NFS configured (as the UI “wizard” forces you to configure one). You
> don’t have any active VM configured on that NFS primary storage. You then
> perform maintenance on the NFS storage, and take it offline…
> >
> > All your hosts will then reboot, resulting in major downtime that’s
> completely unnecessary. There’s not even an option to disable this at this
> point… We’ve removed the reboot instructions from the HA script on all our
> instances…
> >
> > Regards,
> >
> > Frank
> >
>
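
Frank's scenario quoted above (hosts rebooting although no VM lives on the NFS pool that went offline) can be illustrated with a small guard. This is only a sketch of the condition he appears to be arguing for, not CloudStack's actual HA heartbeat logic; the `StoragePool` type, its fields, and `should_self_fence` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoragePool:
    """Hypothetical view of one primary storage pool as seen from a host."""
    name: str
    kind: str          # "nfs" or "local" (assumed labels)
    reachable: bool    # result of the host's own heartbeat/write check
    active_vms: int    # VMs on this host with disks on this pool


def should_self_fence(pools: List[StoragePool]) -> bool:
    """Reboot only if an unreachable NFS pool actually backs running VMs."""
    return any(
        p.kind == "nfs" and not p.reachable and p.active_vms > 0
        for p in pools
    )


# Frank's case: VMs run on local storage, the wizard-mandated NFS pool is
# empty and has been taken offline for maintenance.
pools = [
    StoragePool("local-1", "local", reachable=True, active_vms=5),
    StoragePool("nfs-1", "nfs", reachable=False, active_vms=0),
]
print(should_self_fence(pools))
```

With a guard like this, taking an unused NFS pool offline would not trigger a reboot, while an unreachable NFS pool that still backs running VMs would.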

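Since the quoted proposal describes the fence as "just a script", here is a minimal sketch of what such a script might look like when it wraps `ipmitool`. It is not the script from the proposal; the function names, the `lanplus` interface choice, the 30-second timeout, and the example address and credentials are all assumptions for illustration. On HP hardware the command list would be swapped for the ILO tool of your choice.

```python
import subprocess
from typing import List

# Power actions accepted by `ipmitool chassis power`.
ALLOWED_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def build_fence_command(bmc_host: str, user: str, password: str,
                        action: str = "off") -> List[str]:
    """Build the ipmitool argument list for one power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host, "-U", user, "-P", password,
        "chassis", "power", action,
    ]


def fence_host(bmc_host: str, user: str, password: str,
               dry_run: bool = True) -> bool:
    """Power the host off through its BMC; True means the fence succeeded."""
    cmd = build_fence_command(bmc_host, user, password, "off")
    if dry_run:
        # Nothing is executed; useful on test hardware before going live.
        return True
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return result.returncode == 0


# Hypothetical BMC address and credentials, dry run only.
print(fence_host("192.0.2.10", "admin", "secret", dry_run=True))
```

Keeping command construction separate from execution makes it possible to check the exact command line on test hardware before allowing the script to power anything off.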