Merge pull request #895 from SudharmaJain/cs-8911

CLOUDSTACK-8911: VM start job got stuck in loop looking for suitable host

The VM instance creation job gets stuck in a loop when the VM requires local storage, some hosts have reached the max guest limit, and the remaining hosts do not have enough storage available. This happens because hosts that had reached the max guest limit were not added to the avoid list, and hence the cluster was never added to it either.

Verified the fix on my local setup.

Repro Steps:
1. Take an environment with a single cluster and 2 hosts.
2. Change the max guest limit for the hypervisor so that one host reaches its max guest limit.
3. Change the thresholds so that the other host does not have enough storage. If required, create a VM with a sufficiently large disk.
4. Now deploy a VM with local storage.
5. The cluster is never put in the avoid set, and the job keeps looking for a suitable host.
6. Once the max guest limit is increased, the VM deploys, or fails if there is a lack of storage.
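
The repro above can be sketched as a small standalone simulation. This is not CloudStack code: the Host class, the allocate and clusterExhausted helpers, and the assumption that a host lacking storage is already recorded in the avoid set are all hypothetical, chosen only to show why skipping a host without recording it keeps the cluster from ever being marked as exhausted.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AvoidSetSketch {
    // Hypothetical stand-in for a host; not the CloudStack host class.
    static final class Host {
        final long id;
        final int runningVms;
        final int maxGuests;
        final boolean hasLocalStorage;

        Host(long id, int runningVms, int maxGuests, boolean hasLocalStorage) {
            this.id = id;
            this.runningVms = runningVms;
            this.maxGuests = maxGuests;
            this.hasLocalStorage = hasLocalStorage;
        }
    }

    // One allocator pass. Returns the chosen host id, or -1 if none fits.
    // recordSkipped = true models the fix: a host at its guest limit is
    // added to the avoid set before moving on to the next host.
    static long allocate(List<Host> hosts, Set<Long> avoid, boolean recordSkipped) {
        for (Host h : hosts) {
            if (avoid.contains(h.id)) {
                continue;
            }
            if (h.runningVms >= h.maxGuests) {
                if (recordSkipped) {
                    avoid.add(h.id);
                }
                continue;
            }
            if (!h.hasLocalStorage) {
                // Assumption: the storage check already records the host.
                avoid.add(h.id);
                continue;
            }
            return h.id;
        }
        return -1;
    }

    // The cluster can be given up on only when every host is in the avoid set.
    static boolean clusterExhausted(List<Host> hosts, Set<Long> avoid) {
        return hosts.stream().allMatch(h -> avoid.contains(h.id));
    }

    // Runs one pass over the two-host repro and reports whether the
    // caller would be able to stop retrying this cluster.
    static boolean clusterAvoidedAfterPass(boolean recordSkipped) {
        List<Host> hosts = List.of(
                new Host(1L, 50, 50, true),    // at max guest limit
                new Host(2L, 3, 50, false));   // not enough local storage
        Set<Long> avoid = new HashSet<>();
        allocate(hosts, avoid, recordSkipped);
        return clusterExhausted(hosts, avoid);
    }

    public static void main(String[] args) {
        System.out.println("without fix, cluster avoided: " + clusterAvoidedAfterPass(false));
        System.out.println("with fix, cluster avoided: " + clusterAvoidedAfterPass(true));
    }
}
```

Without the recording step, host 1 never enters the avoid set, so a caller that retries until the cluster is exhausted has no reason to stop.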

* pr/895:
  CLOUDSTACK-8911: VM start job got stuck in loop looking for suitable host

Signed-off-by: Remi Bergsma <github@remi.nl>
Remi Bergsma 2015-10-28 11:29:32 +01:00
commit 7d46b2ee56

@@ -297,6 +297,7 @@ public class FirstFitAllocator extends AdapterBase implements HostAllocator {
s_logger.debug("Host name: " + host.getName() + ", hostId: " + host.getId() +
" already has max Running VMs(count includes system VMs), skipping this and trying other available hosts");
}
avoid.addHost(host.getId());
continue;
}
@@ -305,6 +306,7 @@ public class FirstFitAllocator extends AdapterBase implements HostAllocator {
ServiceOfferingDetailsVO groupName = _serviceOfferingDetailsDao.findDetail(serviceOfferingId, GPU.Keys.pciDevice.toString());
if(!_resourceMgr.isGPUDeviceAvailable(host.getId(), groupName.getValue(), offeringDetails.getValue())){
s_logger.info("Host name: " + host.getName() + ", hostId: "+ host.getId() +" does not have required GPU devices available");
avoid.addHost(host.getId());
continue;
}
}