When the static route service is not available on the VPC and a static route is created, the static route is created in a revoked state.
Currently, the UI doesn't distinguish between active or revoked static routes.
This PR adds the missing state filter to the list routes command and only lists active routes in the UI.
It also ignores revoked routes when the private gateway is being removed but clears out the inactive routes before the gateway is removed.
Fixes#2908
While migrate a vm, in the popup, the host dedicated to other accounts/domains are also 'Suitable" for migration, which is obviously wrong.
The same issue happens with api findHostsForMigration
This PR fixes an issue where an instance fails to deploy due to a null pointer when using an L2 Guest Network with DefaultL2NetworkOfferingConfigDrive on Xenserver. It also fixes migrating an instance to another host.
This has been tested by:
- Creating an L2 Guest network, using DefaultL2NetworkOfferingConfigDrive as the network offering.
- Deploying an instance using the L2 Guest network created.
- Migrating the instance away from the host and back
Root cause:
Even though dynamic scaling job is handled in vmworkjob queue which ensures serilizing multiple jobs but the database updating and generating usage events are out of the job queue.
Solution:
Moved all updations into the job queue
Firstly I have tested all the scenarios to check if nothing is broken:
Scaling on a running VM with normal compute offering
Scaling on a stopped VM with normal compute offering
Scaling on a running VM with custom compute offering
Scaling on stopped VM with custom compute offering
Scaling on stopped/running VM between custom compute offering and normal compute offering and combinations among these. Checked if the custom parameters have been populated or deleted accordingly based on the offering to which the VM is scaled
Since this is a corner scenario I could not test the exact point where two usage events are recorded at the same time for two different API calls on same VM.
Fixes#4090
When trying to migrate a VM across 2 clusters, if a snapshot has been deleted and garbage collection has run to update the removed field, it is not possible to migrate the instance due to a null pointer.
When expunge a Running vm, vm will be stopped with forcestop=false which does not make sense. we should honor vm.destroy.forcestop in global setting, or always set forcestop=true.
* Update AncientDataMotionStrategy.java
fix When secondary storage usage is> 90%, VOLUME migration across primary storage will cause the migration to fail and lose VOLUME
* Update AncientDataMotionStrategy.java
Volume is migrated across Primary storage. If no secondary storage is available(Or used capacity> 90% ), the migration is canceled.
Before modification, if secondary storage cannot be found, copyVolumeBetweenPools return NUll
copyAsync considers answer = null to be a sign of successful task execution, so it deletes the VOLUME on the old primary storage. This is the root cause of data loss, because VOLUME did not perform the migration at all.
* code in comment removed
Co-authored-by: div8cn <35140268+div8cn@users.noreply.github.com>
Co-authored-by: Daan Hoogland <dahn@onecht.net>
* Fixes snapshot deletion
* Remove legacy '@Component', it is not necessary in this bean/class.
* Fix log message missing %d and remove snapshot on DB
* Remove "dummy" boolean return statement
* Manage snapshot deletion for KVM + NFS (primary storage)
* checkstyle trailing spaces
* rename options strings to *_OPTION
* Fix typo on deleteSnapshotOnSecondaryStorage and enhance log message
* Move the snapshotDao.remove(snapshotId); (#4006)
* Fix deletesnapshot worflow to handle both snapshots created in primary storage and snapshots backed up to secondary storage
* Fix extra space
* refactor out separate handling methods for secondary and primary (reducing returns)
* return false on unexpected error or log when expected
* != instead of ==
* secondary instead of backup storage
* init to null
* Handle snapshot deletion on primary storage. When primary store ref not found for snapshot do not fail the operation.
* Fix debug levels on log messages
Co-authored-by: GabrielBrascher <gabriel@apache.org>
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
As previously described by PR #3929:
If vm has attached ISO, the migration fails with error message "org.libvirt.LibvirtException: Cannot access storage file /mnt/b33e5a1d-e4ea-3465-b6ac-c98dc8ff8af0/207-2-cc5fd717-2d57-3bb3-bcf6-2c930268db6c.iso"
Fixes#3191
When a template is registered, code stores md5sum of the downloaded file in the vm_template table. However, this downloaded file could be deleted after template installation if it is not an actual (.qcow2, .ova, etc.) file. When the user copies a template using copyTemplate API, the actual template file will be copied across the image stores. Matching checksum for the copied templated file and the stored value from the vm_template table will result in a mismatch.
Changes will set an empty checksum value for the copied template while passing to download service which allows skipping wrong checksum check for the copied while install.
However, this results in a change in checksum value for concerned template entry in vm_template table post template install.
Co-authored-by: dahn <daan.hoogland@gmail.com>
When start a vm or migrate a vm (away from a host in host maintenance), cloudstack will check capacity of all hosts and choose one. If there are hundreds of hosts on the platform, it will take some seconds. When cloudstack choose a host and start/migrate vm to it, the resource consumption of the host might have been changed. This normally happens when we start/migrate multiple vms.
It would be better to double check the host capacity when start vm on a host.
This PR includes the fix for cpucore capacity when start/migrate a vm.
When we calculate a resource consumption of a host, we need to take the vms in following states into calculation: Running, Starting, Stopping, Migrating (to the host), and vms are Migrating from the host. Because, when stop a vm, the resource on host will be released when vm is stopped. When migrate a vm, the resource on destination host will be increased before migration starts, and resource on source host will be decreased after migraiton succeeds.
In cloudstack, there is a task named CapacityChecked which run every 5 minutes (capacity.check.period =300000 ms by default). It recalculates capacity of all hosts. However, it takes only vms in Running and Starting into consideration. We have faced some issues in host maintenance due to it.
Steps to reproduce the issue
(1) migrate N vms from host A to host B, cpu/ram resource increases before the migration.
(2) capacity check recalculate the capacity of hosts. used capacity of Host B will be reset to original value (not including the vms in Migrating).
(3) migrate some more vms from other host to host B, the migrations are allowed by cloudstack (because used capacity is incorrect). If the actual used memory exceed the physical memory on the host, there might be some critical issues (for example, libvirt dies)
Diagnostics are hard when a snapshot fails if a null pointer occurs. This is because no stack trace or location of the error is logged. I.E.
2019-10-21 12:55:00,056 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-131:ctx-80420156 job-10033827/job-10033828 ctx-4864e2f5) (logid:21454564) copy snasphot failed: java.lang.NullPointerException
* server: Do NOT cleanup dhcp and dns when stop a vm
According comment in PR #3608, dhcp and dns entries are cleaned up only when a VM is expunged.
Revert part of commit 8fb388e9312b917a8f36c7d7e3f45985a95ce773.
* server: cleanup dns/dhcp entries in removeNic instead of finalizeExpunge
When a network IP range is removed, the "vlan" stays mapped on pod_vlan_map; therefore, the method that lists the VLANs by pod id will return null VLANS.
This PR adds proper verifications to avoid null pointer exception when deploying VRs on a pod with removed VLANs. The exception was caused on getPlaceholderNicForRouter.
Problem: In Vmware, appliances that have options that are required to be answered before deployments are configurable through vSphere vCenter user interface but it is not possible from the CloudStack user interface.
Root cause: CloudStack does not handle vApp configuration options during deployments if the appliance contains configurable options. These configurations are mandatory for VM deployment from the appliance on Vmware vSphere vCenter. As shown in the image below, Vmware detects there are mandatory configurations that the administrator must set before deploy the VM from the appliance (in red on the image below):
Solution:
On template registration, after it is downloaded to secondary storage, the OVF file is examined and OVF properties are extracted from the file when available.
OVF properties extracted from templates after being downloaded to secondary storage are stored on the new table 'template_ovf_properties'.
A new optional section is added to the VM deployment wizard in the UI:
If the selected template does not contain OVF properties, then the optional section is not displayed on the wizard.
If the selected template contains OVF properties, then the optional new section is displayed. Each OVF property is displayed and the user must complete every property before proceeding to the next section.
If any configuration property is empty, then a dialog is displayed indicating that there are empty properties which must be set before proceeding
image
The specific OVF properties set on deployment are stored on the 'user_vm_details' table with the prefix: 'ovfproperties-'.
The VM is configured with the vApp configuration section containing the values that the user provided on the wizard.
Fix regression bug that affects KVM local storage migration. Some of the desired execution flows for KVM local storage migration had been altered to allow only managed storage to execute. Fixed allowing managed and non managed storages to execute.
Fixes#3521
Retrieval of an image store using ImageStoreProviderManager has been refactored by introducing three different methods,
DataStore getRandomImageStore(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will not be used here.
DataStore getImageStoreWithFreeCapacity(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will be used here and the store with max free space will be returned. If no store with filled storage less than the threshold is found, the NULL value will be returned.
List<DataStore> listImageStoresWithFreeCapacity(List<DataStore> imageStores);
To get a list of image stores for writing purpose which fulfills threshold capacity check.
Correspondingly DataStoreManager methods have been refactored to return similar values for a given zone.
Fixes#3287 - NULL value will be returned when secondary storage is needed for writing but there is not store with free space.
Fixes#3041 - Rather than returning random secondary storage for writing, storage with max. free space will be returned.
Fixes#3478 - For migration on VMware, all writable secondary storage will be mounted while preparation.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Currently when refreshing disk usage stats all kvm agents are asked to collect stats for all volumes. In setups with multiple kvm hosts where managed storage is used, not all volumes are attached to all kvm hosts, this results in a large number of warnings in the kvm agent logs. This change introduces a filter step in case managed storage is used so that the management server only requests kvm agents for stats about volumes that are connected to each kvm host.
Fixes checkstyle issue caused by previous commit 6a511fce40f5363d71c5515df9841c2493c73c31
from PR #3466 where a minor review fix did not address this. Merging this
one to unblock few other PRs after running a local build test.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Add CephSnapshotStrategy to handle RBD revert (rollback) snapshot. In order to support RBD revert (rbd_rollback), this PR adds a CephSnapshotStrategy class to handle Ceph/RBD snapshot actions.
* Add revoke certificates API
* Add background task to sync certificates
* Fix marvin test and revoke certificate
* Fix certificate sent to hypervisor was missing headers
* Fix background task for uploading certificates to hosts