Some VM deployments fail when multiple VMs are deployed at a time, mainly due to NetworkModel code that iterates over all the vlans in the pod. This causes each deployVM thread to hold the global lock on Network longer and introduces delays. The delay in turn causes more threads to choose the same host and fail, since capacity is not available on that host.
The following changes reduce delays during VM deployment and, in turn, the deployment failures seen when multiple VMs are launched at a time.
In Planner, remove the clusters that do not contain a host with a matching service offering tag. This saves iterations over clusters that don't have a matching tagged host.
In NetworkModel, do not query the vlans for the pod within the loop (see the sketch below). Also optimize the logic that queries the ip/ipv6 addresses.
In DeploymentPlanningManagerImpl, do not process the affinity group if the plan already has a hostId provided.
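A minimal sketch of the hoisting idea in NetworkModel; `VlanDao`, `Vlan` and the method names below are illustrative stand-ins, not the actual CloudStack classes:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: query the pod's vlans once, outside the per-nic loop.
public class VlanHoistingSketch {

    record Vlan(long id, String ipRange) {}

    interface VlanDao {
        List<Vlan> listVlansForPod(long podId); // one DB round trip
    }

    // Before the fix the pod's vlans were queried inside the per-nic loop,
    // so every iteration paid a query while the global network lock was held.
    static List<String> chooseRanges(long podId, List<Long> nicIds, VlanDao dao) {
        List<Vlan> podVlans = dao.listVlansForPod(podId); // hoisted: query once
        List<String> ranges = new ArrayList<>();
        for (Long nicId : nicIds) {
            if (!podVlans.isEmpty()) {
                // reuse the cached list instead of re-querying per nic
                ranges.add(podVlans.get(0).ipRange());
            }
        }
        return ranges;
    }
}
```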
CLOUDSTACK-9766 : Executing deleteSnapshot api with already deleted snapshot does not throw any exception or failure message
If we try to delete a snapshot that has already been deleted, no proper error appears in the log and the code simply tries to delete the snapshot again.
Steps to reproduce:
-------
1. Create a snapshot
2. Delete the snapshot
3. Try to delete the snapshot that was deleted in step 2
Expected Result
-------------
The result should show a proper error message. A request to delete an already deleted snapshot should not be placed.
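A fail-fast guard along these lines addresses it; this is a self-contained sketch with invented types, not the actual patch, which lives in CloudStack's snapshot manager:

```java
// Sketch: surface a clear client error instead of silently re-running the delete.
public class DeleteSnapshotGuardSketch {

    enum State { BackedUp, Destroyed }

    static class Snapshot {
        State state;
        Long removedTimestamp; // non-null once the row is soft-deleted
        Snapshot(State s, Long removed) { state = s; removedTimestamp = removed; }
    }

    static void checkDeletable(Snapshot snap, long id) {
        if (snap == null || snap.removedTimestamp != null || snap.state == State.Destroyed) {
            throw new IllegalArgumentException(
                "Snapshot " + id + " does not exist or is already deleted");
        }
    }

    public static void main(String[] args) {
        Snapshot deleted = new Snapshot(State.Destroyed, 123456789L);
        try {
            checkDeletable(deleted, 42);
        } catch (IllegalArgumentException e) {
            System.out.println("expected: " + e.getMessage());
        }
    }
}
```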
* pr/1924:
CLOUDSTACK-9766 : Executing deleteSnapshot api with already deleted snapshot does not throw any exception or failure message
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
[4.10] CLOUDSTACK-7985: assignVM in Advanced zone with Security Groups
This commit contains the following changes:
(1) implementation of assignVM in Advanced zone with Security Groups
(2) keep the default nic on shared network when assignVM (see the sketch below)
(3) allow migrating a VM from/to a project
(4) UI change for selecting account/project/network
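A rough sketch of change (2), with invented types, showing why the default NIC on a shared network can be kept across the account change:

```java
import java.util.List;

// Sketch: when a VM is assigned to another account, a default NIC that sits
// on a shared network can be kept, since shared networks are visible to the
// new account as well. Types here are illustrative, not CloudStack's.
public class AssignVmNicSketch {

    record Nic(long id, boolean onSharedNetwork, boolean defaultNic) {}

    static Nic pickDefaultNic(List<Nic> nics) {
        for (Nic nic : nics) {
            if (nic.defaultNic() && nic.onSharedNetwork()) {
                return nic; // keep the existing default
            }
        }
        // otherwise the caller must allocate a NIC on a network
        // the new account can use
        return null;
    }

    public static void main(String[] args) {
        List<Nic> nics = List.of(new Nic(1, true, true), new Nic(2, false, false));
        System.out.println(pickDefaultNic(nics)); // keeps NIC 1
    }
}
```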
* pr/844:
CLOUDSTACK-7985: assignVM in Advanced zone with Security Groups
CLOUDSTACK-7985: keep the default nic on shared network when assignVM
CLOUDSTACK-7985: (1) allow migrate vm from/to project; (2) UI change for selecting account/project/network
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
This closes #1644
* 4.9:
CLOUDSTACK-4858 Honors the snapshot.backup.rightafter configuration variable Unhides snapshot.backup.rightafter from global configuration
CLOUDSTACK-4858 Honors the snapshot.backup.rightafter configuration variable
Unhides snapshot.backup.rightafter from global configuration
If snapshot.backup.rightafter is set to false (defaults to true), snapshots are
not backed up to secondary storage.
This is the same as PR #1644 applied to 4.9, as per @jburwell
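A minimal sketch, with invented config plumbing, of how the flag gates the post-snapshot backup; CloudStack reads the value through its own configuration framework:

```java
// Sketch: honor snapshot.backup.rightafter after a snapshot is taken.
public class BackupRightAfterSketch {

    interface ConfigDao { boolean getBoolean(String key, boolean defaultValue); }

    static void afterSnapshotTaken(long snapshotId, ConfigDao config) {
        // defaults to true, preserving the old always-backup behavior
        boolean backupRightAfter = config.getBoolean("snapshot.backup.rightafter", true);
        if (backupRightAfter) {
            backupToSecondaryStorage(snapshotId);
        }
        // when false, the snapshot stays on primary until explicitly backed up
    }

    static void backupToSecondaryStorage(long snapshotId) {
        System.out.println("backing up snapshot " + snapshotId);
    }
}
```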
* pr/1697:
CLOUDSTACK-4858 Honors the snapshot.backup.rightafter configuration variable Unhides snapshot.backup.rightafter from global configuration
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
CLOUDSTACK-9738: [Vmware] Optimize vm expunge process for instances with vm snapshots
## Description
It was noticed that expunging instances with many vm snapshots took a lot of time, as the hypervisor received as many tasks as the instance had vm snapshots, apart from the delete vm task. We propose a way to optimize this process for instances with vm snapshots by sending only one delete task to the hypervisor, which will delete the vm and its snapshots.
## Use cases
1. deleteVMSnapshot -> no changes to current behavior
2. destroyVM with expunge=false -> no action on the VM snapshots is performed at the moment. When the VM cleanup thread executes, it will perform the same sequence as (3). If the instance is recovered before being expunged by the cleanup thread, it will remain intact with its VMSnapshot chain present
3. destroyVM with expunge=true (sketched below):
* VM snapshot is marked with a removed timestamp and state = Expunging in the DB
* VM is deleted on the hypervisor
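A condensed sketch of the optimized path; types are invented, the real change is in the VMware VM snapshot handling:

```java
import java.util.Date;
import java.util.List;

// Sketch: mark all VM snapshots Expunging in the DB, then send one delete
// task instead of one task per snapshot plus the VM delete task.
public class ExpungeWithSnapshotsSketch {

    enum State { Ready, Expunging }

    static class VmSnapshot {
        State state = State.Ready;
        Date removed;
    }

    interface Hypervisor { void deleteVmAndSnapshots(long vmId); } // single task

    static void expunge(long vmId, List<VmSnapshot> snapshots, Hypervisor hv) {
        for (VmSnapshot snap : snapshots) {
            snap.state = State.Expunging;   // mark in DB first
            snap.removed = new Date();
        }
        hv.deleteVmAndSnapshots(vmId);      // one hypervisor task for everything
    }
}
```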
* pr/1905:
CLOUDSTACK-9738: [Vmware] Optimize vm expunge process for instances with vm snapshots
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
(1) add support to create/delete/revert vm snapshots on running vms with QCOW2 format (see the sketch after this list)
(2) add new API to create volume snapshot from vm snapshot
(3) delete metadata of vm snapshots before stopping/migrating and recover vm snapshots after starting/migrating
(4) enable deleting a VM snapshot on a stopped vm, or when the vm snapshot is not listed in the qcow2 image.
(5) enable smoke tests for vm snapshots on KVM
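For item (1), a snapshot of a running QCOW2-backed guest can be taken through the libvirt Java binding roughly like this; the domain name, snapshot name and XML are placeholders, and this is not the CloudStack KVM plugin code itself:

```java
import org.libvirt.Connect;
import org.libvirt.Domain;
import org.libvirt.DomainSnapshot;
import org.libvirt.LibvirtException;

// Sketch: internal snapshot of a running QCOW2 guest via libvirt.
public class VmSnapshotSketch {
    public static void main(String[] args) throws LibvirtException {
        Connect conn = new Connect("qemu:///system");
        Domain vm = conn.domainLookupByName("i-2-10-VM"); // placeholder name
        // For QCOW2-backed guests libvirt can take an internal snapshot that
        // captures disk and memory state while the VM keeps running.
        String xml = "<domainsnapshot><name>vmsnap-1</name></domainsnapshot>";
        DomainSnapshot snap = vm.snapshotCreateXML(xml);
        System.out.println("created snapshot: " + snap.getXMLDesc());
        conn.close();
    }
}
```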
- Bump spring-framework version to 4.x and Jetty to a version that runs with JDK8
- Bump servlet dependency version
- Migrate spring xmls to version 4, fixing schema locations that are 3.0
dependent in various xmls.
- Fix failing tests due to spring upgrade
(Thanks @marcaurele Marc-Aurèle Brothier for fixing them)
* Fix test DeploymentPlanningManagerImplTest
* Fix GloboDNS test
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Introduced a global configuration flag 'cluster.threshold.enabled'. By default the flag is true.
If the value is false, then a VM can be started in a cluster even if the cluster thresholds are
crossed. However, for a new VM deployment the cluster threshold will always be honoured.
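A simplified sketch of the intended decision logic; the flag lookup is reduced to a static field here, whereas CloudStack reads it through its configuration framework:

```java
// Sketch: cluster.threshold.enabled gates threshold checks for VM starts,
// but new deployments always honour the thresholds.
public class ClusterThresholdSketch {

    static boolean clusterThresholdEnabled = true; // global default

    static boolean allowStartInCluster(boolean thresholdCrossed, boolean newDeployment) {
        if (newDeployment) {
            return !thresholdCrossed;  // new deployments always honour thresholds
        }
        if (!clusterThresholdEnabled) {
            return true;               // starting an existing VM may exceed thresholds
        }
        return !thresholdCrossed;
    }

    public static void main(String[] args) {
        clusterThresholdEnabled = false;
        System.out.println(allowStartInCluster(true, false)); // true: start allowed
        System.out.println(allowStartInCluster(true, true));  // false: new deploy blocked
    }
}
```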
A constructor signature has changed between 4.8 and 4.9+ branches which caused
failure in a unit test introduced by PR #1694. This fixes the unit test by
passing null as the additional parameter (the test does not need instantiated
object).
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9509: Host Connects Without Storage
A KVM host whose shared storage had failed was accepted by the mgmt server with the
host state as Up, even though there was no primary/shared storage available on
it. This patch offers a quick fix by throwing an exception in the storage monitor
that connects the storage pool on the host. The failure is trapped by the agent manager,
which disconnects the agent without any further investigation.
Based on lab tests, the KVM agent may take up to 2 minutes to attempt an NFS mount when
the storage is inaccessible (firewalled, or shut down) before returning with
an error. It is safe to assume that this won't add pressure on the mgmt server due to
repeated reconnection attempts, as the KVM agent would retry reconnection every 2
minutes.
Such KVM hosts, where the failure happens due to storage issues, will be
briefly put in the Alert state but will mostly be in the Connecting state, during which
the KVM host attempts to mount/reconfigure the NFS storage pool.
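The shape of the fix, condensed; the types are simplified stand-ins for CloudStack's listener interfaces, not the literal patch:

```java
// Sketch: when connecting a host's storage pool fails, throw from the
// monitor so the agent manager disconnects the host instead of marking it Up.
public class StoragePoolMonitorSketch {

    static class ConnectionException extends Exception {
        ConnectionException(String msg) { super(msg); }
    }

    interface StoragePoolConnector { boolean connectPoolToHost(long poolId, long hostId); }

    static void processConnect(long hostId, long poolId, StoragePoolConnector connector)
            throws ConnectionException {
        if (!connector.connectPoolToHost(poolId, hostId)) {
            // propagating the failure keeps a storage-less host out of the Up state
            throw new ConnectionException(
                "Unable to connect storage pool " + poolId + " to host " + hostId);
        }
    }
}
```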
* pr/1694:
CLOUDSTACK-9509: Host Connects Without Storage
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR adds the ability to pass a new parameter, locationType,
to the `createSnapshot` API command. Depending on the locationType,
we decide where the snapshot should go in the case of managed storage.
There are two possible values for the locationType param
1) `Standard`: The standard operation for managed storage is to
keep the snapshot on the device. For non-managed storage, this will
be to upload it to secondary storage. This option will be the
default.
2) `Archive`: Applicable only to managed storage. This will
keep the snapshot on the secondary storage. For non-managed
storage, this will result in an error.
The reason for implementing this feature is to avoid a single
point of failure for primary storage. Right now in case of managed
storage, if the primary storage goes down, there is no easy way
to recover data as all snapshots are also stored on the primary.
This feature allows us to mitigate that risk. A sketch of the parameter handling follows.
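A hypothetical sketch of the locationType resolution and validation; the enum and checks are illustrative, not the exact patch:

```java
// Sketch: Standard is the default; Archive is rejected for non-managed storage.
public class LocationTypeSketch {

    enum LocationType { STANDARD, ARCHIVE }

    static LocationType resolve(String param, boolean managedStorage) {
        // Standard is the default when the parameter is omitted
        LocationType type = (param == null)
                ? LocationType.STANDARD
                : LocationType.valueOf(param.toUpperCase());
        if (type == LocationType.ARCHIVE && !managedStorage) {
            // Archive (snapshot kept on secondary) only applies to managed storage
            throw new IllegalArgumentException(
                "locationType=Archive is only applicable to managed storage");
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, false));      // STANDARD
        System.out.println(resolve("Archive", true));  // ARCHIVE
    }
}
```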
CLOUDSTACK-9438: Fix for CLOUDSTACK-9252 - Make NFS version changeable in UI
JIRA TICKET: https://issues.apache.org/jira/browse/CLOUDSTACK-9438
### Introduction
From #1361 it was possible to configure the NFS version for secondary storage mounts.
However, changing the NFS version required inserting a new detail into the `image_store_details` table, with `name = 'nfs.version'` and `value = X`, where X is the desired NFS version, and then restarting the management server for the change to take effect.
Our improvement aims to make the NFS version changeable from the UI, instead of through the previously described workflow.
### Proposed solution
Basically, the NFS version is defined as an image store ConfigKey, which implied:
* Adding a new Config scope: **ImageStore**
* Making the `ImageStoreDetailsDao` class extend `ResourceDetailsDaoBase` and `ImageStoreDetailVO` implement `ResourceDetail`
* Inserting a `display` column into the `image_store_details` table
* Extending `ListCfgsCmd` and `UpdateCfgCmd` to support the **ImageStore** scope, which implied:
  * Injecting `ImageStoreDetailsDao` and `ImageStoreDao` into the `ConfigurationManagerImpl` class, in the `cloud-server` module.
### Important
It is important to mention that the `ImageStoreDaoImpl` and `ImageStoreDetailsDaoImpl` classes were moved from the `cloud-engine-storage` module to the `cloud-engine-schema` module so that Spring can find those beans to inject into `ConfigurationManagerImpl` in the `cloud-server` module.
We had these maven dependencies between modules:
* `cloud-server --> cloud-engine-schema`
* `cloud-engine-storage --> cloud-secondary-storage --> cloud-server`
As `ImageStoreDaoImpl` and `ImageStoreDetailsDao` were defined in `cloud-engine-storage`, and they were needed in the `cloud-server` module to be injected into `ConfigurationManagerImpl`, adding a dependency from `cloud-server` to `cloud-engine-storage` would have introduced a dependency cycle. To avoid this cycle, we moved those classes to the `cloud-engine-schema` module.
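A simplified, self-contained sketch of a per-image-store config key; the real `ConfigKey` class in CloudStack has a different constructor and resolves per-store values through the details table:

```java
// Sketch: a config key scoped to a single image store, settable from the UI.
public class ImageStoreConfigSketch {

    enum Scope { Global, Zone, Cluster, StoragePool, ImageStore } // new scope added

    record ConfigKey(String name, String defaultValue, String description, Scope scope) {}

    static final ConfigKey NFS_VERSION = new ConfigKey(
            "nfs.version",   // matches the image_store_details name from the PR
            null,            // unset: fall back to the mount default
            "NFS protocol version for secondary storage mounts",
            Scope.ImageStore);

    public static void main(String[] args) {
        System.out.println(NFS_VERSION.name() + " scoped to " + NFS_VERSION.scope());
    }
}
```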
* pr/1615:
CLOUDSTACK-9438: Fix for CLOUDSTACK-9252 - Make NFS version changeable in UI
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
Currently the logic about volume deletion seems to be that an event
should be emitted when the volume delete is requested, not when the
deletion completes.
The VolumeStateListener specifically ignores destroy events for ROOT
volumes, assuming that the ROOT volume only gets deleted when the
instance is destroyed and the UserVmManager should take care of it.
When deleting an account, all of its resources get destroyed, but the
instance expunging circumvents the UserVmManager, and thus we miss the
VOLUME_DESTROY usage event. The account manager now attempts to
properly destroy the vm before expunging it. This way the destroy
logic is respected, including the event emission.
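An illustrative sketch of the ordering fix, with invented service interfaces:

```java
// Sketch: destroy first so VOLUME.DESTROY usage events are emitted for the
// ROOT volume, then expunge. Previously account cleanup expunged directly.
public class AccountCleanupSketch {

    interface VmService {
        void destroyVm(long vmId);  // fires the usage events for ROOT volumes
        void expungeVm(long vmId);  // physical removal, no events
    }

    static void cleanupVm(long vmId, VmService vms) {
        vms.destroyVm(vmId);  // previously skipped, so the event was lost
        vms.expungeVm(vmId);
    }
}
```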
Added checks to prevent a network update when the router state is unknown or when
the new offering removes a service that is in use.
Added a new param, forced, to the updateNetwork API. The network will
undergo a forced update when this param is set to true.
CLOUDSTACK-8751 Clean network config, like firewall rules, when network services are removed during a network update.
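An illustrative sketch, with invented types, of the guard added around updateNetwork:

```java
// Sketch: refuse a potentially disruptive network update unless forced=true.
public class UpdateNetworkGuardSketch {

    enum RouterState { Running, Stopped, Unknown }

    static void checkUpdateAllowed(RouterState state, boolean removesServiceInUse,
                                   boolean forced) {
        if (forced) {
            return; // caller explicitly accepted the disruption
        }
        if (state == RouterState.Unknown) {
            throw new IllegalStateException(
                "Router state is unknown; retry or pass forced=true");
        }
        if (removesServiceInUse) {
            throw new IllegalStateException(
                "New offering removes a service that is in use; pass forced=true to override");
        }
    }
}
```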
Emit template UUID and class type over event bus when deleting templates
New version of #1378 for the 4.7b branch instead of 4.6.
* pr/1564:
Emit template UUID and class type over event bus when deleting templates.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Restricts default login auth handler to ldap and native-cloudstack users
- Refactors and creates a re-usable method to find a domain by id/path
- Adds unit test for the refactored method in DomainManagerImpl
- Adds smoke test for the login handler
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The behavior is now consistent with template creation. This commit
also adds a unit test for this functionality to make sure that it will
always happen.
CLOUDSTACK-9368: Fix for Support configurable NFS version for Secondary Storage mounts
## Description
JIRA TICKET: https://issues.apache.org/jira/browse/CLOUDSTACK-9368
This pull request addresses a problem introduced in #1361 in which the NFS version couldn't be changed after host resources were configured on startup (for hosts using `VmwareResource`); since the host parameters didn't include the `nfs.version` key, it was set to `null`.
## Proposed solution
In this proposed solution, `nfsVersion` is passed in `NfsTO` through `CopyCommand` to `VmwareResource`, which checks whether the NFS version is already configured. If not, it uses the version sent in the command and sets it on its storage processor and storage handler. After that setup, it proceeds to execute the command.
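An illustrative sketch of that flow, with heavily reduced stand-ins for `NfsTO`, `CopyCommand` and `VmwareResource`:

```java
// Sketch: the resource lazily adopts the nfsVersion carried by the command's
// NfsTO when it has none configured, then reconfigures its storage handling.
public class NfsVersionPropagationSketch {

    static class NfsTO { String nfsVersion; }
    static class CopyCommand { NfsTO srcStore = new NfsTO(); }

    static class VmwareResourceLike {
        private String nfsVersion; // null until configured

        void execute(CopyCommand cmd) {
            if (nfsVersion == null && cmd.srcStore.nfsVersion != null) {
                // adopt the version from the command and reconfigure handlers
                nfsVersion = cmd.srcStore.nfsVersion;
                reconfigureStorageProcessor(nfsVersion);
            }
            // ... proceed executing the copy command
        }

        void reconfigureStorageProcessor(String version) {
            System.out.println("storage processor now using NFS v" + version);
        }
    }
}
```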
* pr/1518:
CLOUDSTACK-9368: Fix for Support configurable NFS version for Secondary Storage mounts
Signed-off-by: Will Stevens <williamstevens@gmail.com>