CLOUDSTACK-10309: Add option to control whether VM HA powers on an OOB-shut-off VM (#2473)

When VM HA is enabled and a user shuts down their VM from within the guest OS, CloudStack powers the VM back on. Our environment runs on KVM hosts.

CloudStack cannot tell the difference between a VM failing and a VM being shut down from within the guest OS.

This is a major pain point for all our users, especially since they don't pay for VMs while they are shut off. It is not intuitive for end users that they can't shut down VMs from within the guest OS, particularly when they all come from (non-CloudStack) VMware and Hyper-V environments where this is not an issue.

However, if a host fails, we need VM HA to still work.

This PR creates a configuration option "ha.vm.restart.hostup". With this option set to false, if CloudStack detects that a VM was shut down out-of-band while the host it was on is still online, it won't power the VM back on. The reasoning is that, since the host is online, the VM was most likely shut down from within the guest OS.

When a host actually fails, the standard VM HA logic takes over and powers on the VMs that were running on it, provided they have VM HA enabled.

If "ha.vm.restart.hostup" is true (the default, which matches current behavior), nothing changes: even an in-guest shutdown causes CloudStack to power the VM back on.
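
For illustration only, the restart decision described above can be sketched as a standalone predicate. The class `OobStopPolicy`, the method `shouldScheduleRestart`, and the simplified enums are hypothetical stand-ins rather than CloudStack types; the actual check is the `if` condition changed in `VirtualMachineManagerImpl`. Host-failure handling goes through the standard VM HA path and is not modeled here.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified model of the out-of-band stop check; not CloudStack code.
public class OobStopPolicy {
    // Simplified stand-ins for CloudStack's VM state and hypervisor type.
    enum State { Running, Stopped, Migrating }
    enum HypervisorType { KVM, XenServer, VMware, Hyperv }

    // Hypervisors that the patched condition excludes from the restart path.
    private static final Set<HypervisorType> EXCLUDED =
            EnumSet.of(HypervisorType.VMware, HypervisorType.Hyperv);

    // Mirrors the patched condition: restart only if HA is enabled, CloudStack
    // still believes the VM is Running, the "ha.vm.restart.hostup" value is true,
    // and the hypervisor is not one of the excluded types.
    static boolean shouldScheduleRestart(boolean haEnabled, State state,
                                         boolean haVmRestartHostUp, HypervisorType hypervisor) {
        return haEnabled
                && state == State.Running
                && haVmRestartHostUp
                && !EXCLUDED.contains(hypervisor);
    }

    public static void main(String[] args) {
        // Default setting (true): an in-guest shutdown on KVM is restarted.
        System.out.println(shouldScheduleRestart(true, State.Running, true, HypervisorType.KVM));
        // Setting false: the same event no longer schedules a restart.
        System.out.println(shouldScheduleRestart(true, State.Running, false, HypervisorType.KVM));
    }
}
```

With the flag left at its default of true the predicate matches the previous behavior; setting it to false is the only input that changes the outcome for an HA-enabled KVM VM.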
Slair1 2018-05-21 02:43:38 -05:00 committed by Rohit Yadav
parent 7e6fddb7ab
commit f23278a438

@@ -382,6 +382,9 @@ public class VirtualMachineManagerImpl extends ManagerBase implements VirtualMac
 Integer.class, "vm.job.report.interval", "60",
 "Interval to send application level pings to make sure the connection is still working", false);
+static final ConfigKey<Boolean> HaVmRestartHostUp = new ConfigKey<Boolean>("Advanced", Boolean.class, "ha.vm.restart.hostup", "true",
+"If an out-of-band stop of a VM is detected and its host is up, then power on the VM", true);
 ScheduledExecutorService _executor = null;
 protected long _nodeId;
@@ -4015,7 +4018,7 @@ public class VirtualMachineManagerImpl extends ManagerBase implements VirtualMac
 public ConfigKey<?>[] getConfigKeys() {
 return new ConfigKey<?>[] {ClusterDeltaSyncInterval, StartRetry, VmDestroyForcestop, VmOpCancelInterval, VmOpCleanupInterval, VmOpCleanupWait,
 VmOpLockStateRetry,
-VmOpWaitInterval, ExecuteInSequence, VmJobCheckInterval, VmJobTimeout, VmJobStateReportInterval, VmConfigDriveLabel};
+VmOpWaitInterval, ExecuteInSequence, VmJobCheckInterval, VmJobTimeout, VmJobStateReportInterval, VmConfigDriveLabel, HaVmRestartHostUp};
 }
 public List<StoragePoolAllocator> getStoragePoolAllocators() {
@@ -4160,7 +4163,7 @@ public class VirtualMachineManagerImpl extends ManagerBase implements VirtualMac
 case Stopped:
 case Migrating:
 s_logger.info("VM " + vm.getInstanceName() + " is at " + vm.getState() + " and we received a power-off report while there is no pending jobs on it");
-if(vm.isHaEnabled() && vm.getState() == State.Running && vm.getHypervisorType() != HypervisorType.VMware && vm.getHypervisorType() != HypervisorType.Hyperv) {
+if(vm.isHaEnabled() && vm.getState() == State.Running && HaVmRestartHostUp.value() && vm.getHypervisorType() != HypervisorType.VMware && vm.getHypervisorType() != HypervisorType.Hyperv) {
 s_logger.info("Detected out-of-band stop of a HA enabled VM " + vm.getInstanceName() + ", will schedule restart");
 if(!_haMgr.hasPendingHaWork(vm.getId())) {
 _haMgr.scheduleRestart(vm, true);