When deploying a VM is failed during the allocation process it may leave the resources that have been already allocated before the failure. They will get removed from the database as the whole code block is wrapped inside a transaction twice but the server would not inform the network or storage plugins to clean up the allocated resources.
This PR removes Transactions during VM allocation which results in the allocated VM and its resource records being persisted in DB even during failures. When failure is encountered VM is moved to Error state. This helps VM and its resources to be properly deallocated when it is expunged either by a server task such as ExpungeTask or during manual expunge.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR resource throws exception with the correct error code and logs the error message when a resource allocation failure is encountered during resize volume operation.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Supported Virtual machine operations:
- live migration of VM to another host
- virtual machine snapshots (group snapshot without memory)
- revert VM snapshot
- delete VM snapshot
Supported Volume operations:
- attach/detach volume
- live migrate volume between two StorPool primary storages
- volume snapshot
- delete snapshot
- revert snapshot
This PR adds a check on host's status. Without this if the agent is not in Up or Connecting state, expunging of a VM fails.
Steps to reproduce:
- Enable vm.configdrive.force.host.cache.use in Global Configuration.
- Create a L2 network with config drive
- Deploy a vm with the L2 network created in previous step
- Stop the vm and destroy vm (not expunge it)
- Stop the cloudstack-agent on the VM's host
- Expunge the vm
Fixes: #7428
* Live storage migration of volume in scaleIO within same storage scaleio cluster
* Added migrate command
* Recent changes of migration across clusters
* Fixed uuid
* recent changes
* Pivot changes
* working blockcopy api in libvirt
* Checking block copy status
* Formatting code
* Fixed failures
* code refactoring and some changes
* Removed unused methods
* removed unused imports
* Unit tests to check if volume belongs to same or different storage scaleio cluster
* Unit tests for volume livemigration in ScaleIOPrimaryDataStoreDriver
* Fixed offline volume migration case and allowed encrypted volume migration
* Added more integration tests
* Support for migration of encrypted volumes across different scaleio clusters
* Fix UI notifications for migrate volume
* Data volume offline migration: save encryption details to destination volume entry
* Offline storage migration for scaleio encrypted volumes
* Allow multiple Volumes to be migrated with migrateVirtualMachineWithVolume API
* Removed unused unittests
* Removed duplicate keys in migrate volume vue file
* Fix Unit tests
* Add volume secrets if does not exists during volume migrations. secrets are getting cleared on package upgrades.
* Fix secret UUID for encrypted volume migration
* Added a null check for secret before removing
* Added more unit tests
* Fixed passphrase check
* Add image options to the encypted volume conversion
In troubleshooting ops issues we see logs like:
Maximum domain resource limits of Type 'user_vm' for Domain Id = 763 is exceeded: Domain Resource Limit = (1 bytes) 1, Current Domain Resource Amount = (0 bytes) 0, Requested Resource Amount = (1 bytes) 1."
However there is one missing value (currentResourceReservation) that is used in the calculation of limit check but it is not logged, which leads to confusion. Above we see we are using “0” and requested 1, with our limit being 1, but was rejected. Without logging all the values used in the calculation we don’t understand why it failed.
Additionally, if we had this log above it would be clearer that a second bug is occurring. When we query for domain level resource reservations in “getDomainReservation” the actual SearchBuilder is the listAccountAndTypeSearch, not the listDomainAndTypeSearch. As a result, when we call getDomainReservation the query returns any outstanding domain reservation for any account, as domain ID is not a valid filter for the account search.
This PR:
Increases detailed information in log for checking resource limit to include reservations information for functions: checkDomainResourceLimit() and checkAccountResourceLimit
Fixes getDomainReservation() to use listDomainAndTypeSearch instead of listAccountAndTypeSearch
Co-authored-by: Oscar Sandoval <osandovalocana@apple.com>
Fixes#6697
Allows the server to generate started and completed events for VM.CREATE event type when VM is deployed with startvm=false.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR adjusts the DirectDownload certificate check initial delay. Since the time unit is in hours, I think it was a mistake to schedule the initial check to be in 60 hours after management servers start - the intention was likely 60 seconds. We had turned this feature on to run hourly, not realizing we would have to wait 2.5 days to see it first run!
Co-authored-by: Marcus Sorensen <mls@apple.com>
This PR fixes a null pointer edge case where a PowerFlex volume is attached to a VM.
In this edge case, a VM has been created, started, stopped, and then reimaged. VM has a last host ID but the newly attached volume does not yet have a storage pool assigned, it will be assigned on the VM start. However, we assume that if the VM's host is not null, we need to try to revoke access to the volume for the host. Since there is no storage pool yet for the volume, we hit a null pointer when checking to see if the volume's pool is PowerFlex.
This was affecting all storage types, I could reproduce it with local storage, since the null pointer is at the check for pool's type.
Co-authored-by: Marcus Sorensen <mls@apple.com>
This PR allows admin to filter resources by state for systemvms, router & storagepool. This is part of #7366 .
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR fixes#7362 and also other search criteria to use the name as an exact search where keyword is also there.
Made UI changes for roles search to make use of keyword instead of name.
This fix ensures when datastore cluster in VMware is added as a primary storage pool in CloudStack then all the child datastores (which already exists in CS) should be in Up state.
For example:
1. Datastore Cluster DS has two child datastores A and B in vCenter. (B is already added as a storage pool in CloudStack)
2. Now try to add datastore cluster DS into CloudStack as a primary storage pool
3. CloudStack tries to add child datastores A and B in CloudStack, since B is already there in CloudStack, it will reuse the existing storagepool entry and will keep under parent Storage pool DS.
During Step 3 we are now checking if B is Up state or not.
Fixes the case when userdata variable value contains '=' sign. This PR considers everything before occurrence of first '=' sign as key and remaining string as value.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>